Sierra Wireless AirPrime EM919x PCIe support

Hi,

I’m trying to use the EM919x connected to an i.MX6DualLite through the PCIe gen2 bus, so far without success. Could you help me?

I’m following two approaches:

  1. Use the Sierra driver R21 on the kernel 5.4
  2. Use the upstream MHI driver on the kernel 5.13

QMI and MBIM are exposed but:

  • With the Sierra driver R21, I tried the commands described in the readme to manage the modem over MBIM, without success: commands sent with mbimcli are queued in the MHI channel, but the modem never seems to consume them. In addition, the QMI interface doesn’t seem to work at all: [/dev/mhi_0306_00.01.00_pipe_14] couldn't detect transport type of port: unexpected port subsystem.

  • With the upstream MHI driver, I manage to establish communication through the QMI and MBIM channels, but it’s really random: sometimes the requests succeed, other times they time out or receive an unexpected response…
    Moreover, the MBIM channels only seem to be accessible after trying to send a QMI request. AT commands seem to work correctly.

Moreover, in both approaches, I use the same user space stack:

  • NetworkManager 1.32.4
  • ModemManager 1.18.0
  • libqmi 1.30.2
  • libmbim 1.26.0

In addition, my EM919x seems to be an engineering sample, because the SVID and the SDID are incorrect:

cat /sys/bus/pci/drivers/mhi-pci-generic/0000\:01\:00.0/vendor
     0x17cb
cat /sys/bus/pci/drivers/mhi-pci-generic/0000\:01\:00.0/subsystem_vendor
     0x17cb
cat /sys/bus/pci/drivers/mhi-pci-generic/0000\:01\:00.0/subsystem_device
     0x010c
cat /sys/bus/pci/drivers/mhi-pci-generic/0000\:01\:00.0/device
     0x0306

And the firmware revision is “SWIX55C_00.16.04.00 000000 jenkins 2020/06/02 04:19:42”. I also tried the revision “00.16.04.00_Generi_010.003_000”.

1. The Sierra driver R21 approach

To use the Sierra driver R21, I fixed a few issues in order to compile and load the modules correctly:

  • Allowed cross-compilation,
  • Allowed the use of a single MSI vector,
  • Prevented PCIe low power mode.

The last two changes are needed because our platform doesn’t support PCIe low power mode and isn’t able to assign eight MSI vectors.

I also identified a possible deadlock:

[  193.428213]  *** DEADLOCK ***
[  193.434138] 1 lock held by mbim-proxy/682:
[  193.438238]  #0: bf092340 (&mhi_uci_drv.lock){+.+.}, at: mhi_uci_open+0x28/0x4b4 [mhiuci]

2. The upstream MHI driver approach

I added an MHI channel mapping for the EM919x in the mhi/pci_generic driver:

+static const struct mhi_channel_config mhi_sierra_em919x_channels[] = {
+       MHI_CHANNEL_CONFIG_UL_SBL(2, "SAHARA", 32, 0),
+       MHI_CHANNEL_CONFIG_DL_SBL(3, "SAHARA", 256, 0),
+       MHI_CHANNEL_CONFIG_UL(4, "DIAG", 32, 0),
+       MHI_CHANNEL_CONFIG_DL(5, "DIAG", 32, 0),
+       MHI_CHANNEL_CONFIG_UL(12, "MBIM", 128, 0),
+       MHI_CHANNEL_CONFIG_DL(13, "MBIM", 128, 0),
+       MHI_CHANNEL_CONFIG_UL(14, "QMI", 32, 0),
+       MHI_CHANNEL_CONFIG_DL(15, "QMI", 32, 0),
+       MHI_CHANNEL_CONFIG_UL(32, "DUN", 32, 0),
+       MHI_CHANNEL_CONFIG_DL(33, "DUN", 32, 0),
+       MHI_CHANNEL_CONFIG_HW_UL(100, "IP_HW0", 512, 1),
+       MHI_CHANNEL_CONFIG_HW_DL(101, "IP_HW0", 512, 2),
+};
+
+static struct mhi_event_config modem_sierra_em919x_mhi_events[] = {
+       /* first ring is control+data and DIAG ring */
+       MHI_EVENT_CONFIG_CTRL(0, 2048),
+       /* Hardware channels request dedicated hardware event rings */
+       MHI_EVENT_CONFIG_HW_DATA(1, 2048, 100),
+       MHI_EVENT_CONFIG_HW_DATA(2, 2048, 101)
+};
+
+static const struct mhi_controller_config modem_sierra_em919x_config = {
+       .max_channels = 128,
+       .timeout_ms = 24000,
+       .num_channels = ARRAY_SIZE(mhi_sierra_em919x_channels),
+       .ch_cfg = mhi_sierra_em919x_channels,
+       .num_events = ARRAY_SIZE(modem_sierra_em919x_mhi_events),
+       .event_cfg = modem_sierra_em919x_mhi_events,
+};
+
+static const struct mhi_pci_dev_info mhi_sierra_em919x_info = {
+       .name = "sierra-em919x",
+       .config = &modem_sierra_em919x_config,
+       .bar_num = MHI_PCI_DEFAULT_BAR_NUM,
+       .dma_data_width = 32,
+       .sideband_wake = true,
+};
+
 static const struct mhi_channel_config mhi_quectel_em1xx_channels[] = {
        MHI_CHANNEL_CONFIG_UL(0, "NMEA", 32, 0),
        MHI_CHANNEL_CONFIG_DL(1, "NMEA", 32, 0),
@@ -377,6 +417,9 @@ static const struct pci_device_id mhi_pci_id_table[] = {
                .driver_data = (kernel_ulong_t) &mhi_quectel_em1xx_info },
        { PCI_DEVICE(PCI_VENDOR_ID_QCOM, 0x0308),
                .driver_data = (kernel_ulong_t) &mhi_qcom_sdx65_info },
+       /* EM919x (sdx55), Both for eSIM and Non-eSIM */
+       { PCI_DEVICE_SUB(PCI_VENDOR_ID_QCOM, 0x0306, 0x18d7, 0x0200),
+               .driver_data = (kernel_ulong_t) &mhi_sierra_em919x_info },
        /* T99W175 (sdx55), Both for eSIM and Non-eSIM */
        { PCI_DEVICE(PCI_VENDOR_ID_FOXCONN, 0xe0ab),
                .driver_data = (kernel_ulong_t) &mhi_foxconn_sdx55_info },

Does the new PCI_DEVICE_SUB() line really take precedence over the previous PCI_DEVICE() entry for the same 0x0306 device id? (Just wondering)

Any plan to submit that for inclusion in the upstream driver?

We would like to upstream it, but first we would like to arrive at something functional so that we can validate these changes.

It should work, but I added a hack on top of it because my engineering sample does not report the right svid and sdid.
I will verify, but I’m sure this configuration is used in my setup.

For information, we made an additional setup using an x86-64 laptop, Fedora 34, Linux 5.11 and the R23 Sierra driver, and followed the readme step by step. The result is less conclusive:

  • the modem is enumerated correctly,
  • the kernel modules seem to be probed successfully, with no explicit error,
  • but the logical interfaces (mbim, at…) aren’t exposed.

What firmware version is the module using? I’m running 02.08.01.00, upgraded using qmi-firmware-update with the module in USB mode on a Linux host.

With your mhi_pci_dev_info entry, I’m getting several firmware crashes reported during the system boot:

[    7.060730] mhi-pci-generic 0000:01:00.0: MHI PCI device found: sierra-em919x
[    7.067906] mhi-pci-generic 0000:01:00.0: BAR 0: assigned [mem 0x600000000-0x600000fff 64bit]
[    7.076455] mhi-pci-generic 0000:01:00.0: enabling device (0000 -> 0002)
[    7.083277] mhi-pci-generic 0000:01:00.0: using shared MSI
[    7.089508] mhi mhi0: Requested to power ON
[    7.094080] mhi mhi0: Attempting power on with EE: PASS THROUGH, state: SYS ERROR
[    7.180371] mhi mhi0: local ee: INVALID_EE state: RESET device ee: PASS THROUGH state: SYS ERROR
[    7.187146] mhi mhi0: Power on setup success
[    7.187219] mhi mhi0: Handling state transition: PBL
[    7.189165] mhi mhi0: System error detected
[    7.189178] mhi mhi0: Device MHI is not in valid state
[    7.189189] mhi-pci-generic 0000:01:00.0: firmware crashed (7)
[    7.213682] mhi mhi0: Handling state transition: SYS ERROR
[    7.219183] mhi mhi0: Transitioning from PM state: Linkdown or Error Fatal Detect to: SYS ERROR Process
[    7.228590] mhi-pci-generic 0000:01:00.0: firmware crashed (6)
[    7.234429] mhi mhi0: Failed to transition from PM state: Linkdown or Error Fatal Detect to: SYS ERROR Process
[    7.244433] mhi mhi0: Exiting with PM state: Linkdown or Error Fatal Detect, MHI state: RESET
[    7.252963] mhi mhi0: Handling state transition: DISABLE
[    7.258278] mhi mhi0: Processing disable transition with PM state: Linkdown or Error Fatal Detect
[    7.267155] mhi mhi0: Waiting for all pending event ring processing to complete
[    7.274480] mhi mhi0: Waiting for all pending threads to complete
[    7.280576] mhi mhi0: Reset all active channels and remove MHI devices
[    7.287110] mhi mhi0: Resetting EV CTXT and CMD CTXT
[    7.292077] mhi mhi0: Exiting with PM state: DISABLE, MHI state: RESET
[    7.298683] mhi-pci-generic 0000:01:00.0: failed to power up MHI controller
[    7.306184] mhi-pci-generic: probe of 0000:01:00.0 failed with error -110

It’s a bit random though; sometimes the module boots fine and I can use both the QMI and MBIM wwan devices exposed.

I feel I could have more successful boots with the older 01.04.01.02 firmware the module came with, truth be told.

Hi @aleksander0m ,

Until yesterday, we were using the “SWIX55C_00.16.04.00 000000 jenkins 2020/06/02 04:19:42” firmware, and yesterday we got two new modems with the firmware “01.07.08.00_GENERI_016.003_000”.

We observed a few firmware crashes, but none for several weeks.

The EM919x doesn’t seem to like warm reboots much: we observed some strange behavior after system reboots.

Now, we also have a second setup based on x86-64, Fedora 34, Linux 5.11, the Sierra driver R23, the firmware 01.07.08.00 and a PCIe x1 gen3.

For the moment, we have not tested the upstream driver on this setup.

With the x86-64 setup using the Sierra driver, the QMI interface doesn’t work. MBIM commands seem to work very well, but we fail to attach the packet service and so cannot establish a cellular connection. (We are able to read the rssi and iccid, set the radio …)

On our setup based on i.MX6DL, PCIe x1 gen2, we have two software solutions available, both based on Yocto:

  • The first is based on the kernel 5.11 and the driver Sierra R23,
  • The second is based on the kernel 5.13 and the upstream driver with your mhi_pci_dev_info.

Both use:

  • Libqmi 1.30.2
  • Libmbim 1.26.0
  • Modem Manager 1.18.0
  • Network Manager 1.32.4

With both i.MX6 solutions:

  • Either the kernel logs this error in a loop: “mhi_wwan_ctrl mhi0_QMI: Failed to queue buffer”,
  • or some commands succeed and later ones time out,
  • or we receive an unexpected response, or the response to a previous command.
  • All AT commands seem to succeed,
  • and the firmware update works correctly.

But these random issues seem linked to the i.MX6DL MSI vector limitation, because when we restrict the number of MSI vectors on the x86-64 setup, we observe the same issues.

NB. Yesterday, we also acquired the firmware “03.04.03.00”. We will test it soon. It fixes some major issues in RF, MSI vectors, MHI and MBIM, and adds some QMI objects.

How did you get that? I cannot see that firmware version in https://source.sierrawireless.com/resources/airprime/software/em919x/em9-approved-fw-packages/

We got that firmware version from our distributor; it was released at the end of August.

Shouldn’t sideband_wake be false? It looks like it should only be true for the sdx24.

How many are you able to allocate? In my setup I’m using one single shared MSI vector; it tries to allocate 4 (one for BHI plus one per event ring) but only succeeds with 1. The MBPL Sierra driver always attempts to allocate 8 by default.

Hello @aleksander0m,

Our i.MX6 setup is also using one single shared MSI vector.

On x86-64, with the MBPL Sierra driver, 8 MSI vectors are used. If we limit it to 1, we observe the same kinds of issues as on i.MX6DL, but to a lesser extent, probably because x86-64 provides a PCIe gen3 bus and not gen2 as on i.MX6DL.

I will check if “sideband_wake = true” is correct or not.

This is a RaspberryPi CM4 (Broadcom BCM2711), not the good old i.MX6 :wink:

There’s something fishy here; I’ve now tested with the PCI_DEVICE_SUB() line before the generic one for the 0x0306, and also after it (as you have it); and in both cases, it’s not picking the EM919x specific config, it’s always reporting MHI PCI device found: qcom-sdx55m.

Until now I’ve also been testing with a hack to make sure the correct one was being picked (changing the default one to 0x03ff and making the EM919x the only one for 0x0306). But if I go back to PCI_DEVICE_SUB(), it never gets picked any more.

My lspci says:

01:00.0 Unassigned class [ff00]: Qualcomm Device 0306
	Subsystem: Device 18d7:0200

And that should be in line with the config you suggested:

{ PCI_DEVICE_SUB(PCI_VENDOR_ID_QCOM, 0x0306, 0x18d7, 0x0200),

I’ll go back to the hack I have to make sure the correct one is picked, but we should probably revisit this sometime…

Hello @aleksander0m ,

Yes, on my side, I also use a hack for the moment, because my EM919x doesn’t report the right svid and sdev. To fix this issue, it is necessary to use PCI_DEVICE_SUB instead of PCI_DEVICE to define the first entry.

Otherwise, the svid and sdev of the first entry are set to “any”, and so this entry matches first (see drivers/pci/pci.h in the Linux source code, v6.6.2, on Bootlin).

As for sideband_wake, my distributor tells me I will receive an answer soon.

You mean we also need to use PCI_DEVICE_SUB() for the non-Sierra generic 0x0306 entry? And do we know what the correct svid/sdev would be in that case?

No, it doesn’t. The list is parsed sequentially and a PCI_DEVICE() entry will match any subdevice. See
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/pci/pci-driver.c#n104

Use the source, Luke :slight_smile:
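To make the ordering point concrete, here is a minimal userspace sketch of first-match lookup over an ID table. It is not the kernel code: `ANY_ID`, `struct id_entry`, `struct pci_ids`, `entry_matches()` and `first_match()` are hypothetical names that only mimic what PCI_ANY_ID wildcards and a sequential table walk do, and it ignores dynamic IDs entirely. The vendor value 0x17cb and the subsystem values are the ones quoted earlier in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's PCI_ANY_ID wildcard. */
#define ANY_ID 0xffffffffu

/* One row of a simplified ID table (illustrative, not struct pci_device_id). */
struct id_entry {
    uint32_t vendor, device, subvendor, subdevice;
    const char *name;
};

/* The four IDs a device reports. */
struct pci_ids {
    uint32_t vendor, device, subvendor, subdevice;
};

/* An entry matches when every field is either a wildcard or equal. */
static int entry_matches(const struct id_entry *e, const struct pci_ids *d)
{
    return (e->vendor == ANY_ID || e->vendor == d->vendor) &&
           (e->device == ANY_ID || e->device == d->device) &&
           (e->subvendor == ANY_ID || e->subvendor == d->subvendor) &&
           (e->subdevice == ANY_ID || e->subdevice == d->subdevice);
}

/* Walk the table in order; return the index of the first match, or -1. */
static int first_match(const struct id_entry *table, size_t n,
                       const struct pci_ids *d)
{
    for (size_t i = 0; i < n; i++)
        if (entry_matches(&table[i], d))
            return (int)i;
    return -1;
}

/* Generic entry listed first, as in the patch: it wildcards the subsystem IDs. */
static const struct id_entry generic_first[] = {
    { 0x17cb, 0x0306, ANY_ID, ANY_ID, "qcom-sdx55m" },
    { 0x17cb, 0x0306, 0x18d7, 0x0200, "sierra-em919x" },
};

/* Same two entries with the more specific one listed first. */
static const struct id_entry specific_first[] = {
    { 0x17cb, 0x0306, 0x18d7, 0x0200, "sierra-em919x" },
    { 0x17cb, 0x0306, ANY_ID, ANY_ID, "qcom-sdx55m" },
};

/* Production module as shown by lspci above, and the engineering sample. */
static const struct pci_ids em919x     = { 0x17cb, 0x0306, 0x18d7, 0x0200 };
static const struct pci_ids eng_sample = { 0x17cb, 0x0306, 0x17cb, 0x010c };
```

Under these assumptions, `first_match(generic_first, 2, &em919x)` returns index 0 (the generic entry shadows the Sierra one), while `first_match(specific_first, 2, &em919x)` returns index 0 of the reordered table, which is the Sierra entry. The engineering sample falls through to the generic entry in either ordering, which is why a hack is needed for it.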

That’s what I assumed when I first read the patch, yes.

And I tested yesterday having the more specific PCI_DEVICE_SUB before the generic PCI_DEVICE entry, and for some reason it didn’t match the PCI_DEVICE_SUB even if listed before, which was weird.

Yes, that’s unexpected. I assume you had a clean dynamic device id list? And no typos? Stupid question, I know. But that’s just the kind of mistake I would make…