[24.04-linux-nvidia-6.17] Add core support for SiTime SiT9531x DPLL clock driver#477
[24.04-linux-nvidia-6.17] Add core support for SiTime SiT9531x DPLL clock driver#477gobenji wants to merge 23 commits into
Conversation
Ignore: yes Signed-off-by: Jacob Martin <jacob.martin@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/2158201 Properties: no-test-build Signed-off-by: Jacob Martin <jacob.martin@canonical.com>
Signed-off-by: Jacob Martin <jacob.martin@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/2157912 Use the overflow status bit to track overflow on each bandwidth counter read and add the counter size to the correction when overflow is detected. This assumes that only a single overflow has occurred since the last read of the counter. Overflow interrupts, on hardware that supports them could be used to remove this limitation. Cc: Zeng Heng <zengheng4@huawei.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Zeng Heng <zengheng4@huawei.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> Tested-by: Gavin Shan <gshan@redhat.com> Tested-by: Zeng Heng <zengheng4@huawei.com> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Hanjun Guo <guohanjun@huawei.com> Signed-off-by: Ben Horgan <ben.horgan@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> (backported from commit b353637) [fenghuay: Differences from upstream b353637: - Retains NV long-counter read paths, reset_on_next_read, and T241 now *= 64. - Applies T241 overflow correction * 64 inline in __ris_msmon_read() (moved to mpam_msmon_overflow_val() wrapper in the 9e5afb7 follow-up commit). - Initialize overflow to false. ] Signed-off-by: Fenghua Yu <fenghuay@nvidia.com> Acked-by: Matthew R. Ochs <mochs@nvidia.com> Acked-by: Jamie Nguyen <jamien@nvidia.com> Acked-by: Carol L Soto <csoto@nvidia.com> Signed-off-by: Brad Figg <bfigg@nvidia.com>
BugLink: https://bugs.launchpad.net/bugs/2157912 Now that the larger counter sizes are probed, make use of them. Callers of mpam_msmon_read() may not know (or care!) about the different counter sizes. Allow them to specify mpam_feat_msmon_mbwu and have the driver pick the counter to use. Only 32bit accesses to the MSC are required to be supported by the spec, but these registers are 64bits. The lower half may overflow into the higher half between two 32bit reads. To avoid this, use a helper that reads the top half multiple times to check for overflow. Signed-off-by: Rohit Mathew <rohit.mathew@arm.com> [morse: merged multiple patches from Rohit, added explicit counter selection ] Signed-off-by: James Morse <james.morse@arm.com> Cc: Peter Newman <peternewman@google.com> Reviewed-by: Ben Horgan <ben.horgan@arm.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Fenghua Yu <fenghuay@nvidia.com> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> Tested-by: Gavin Shan <gshan@redhat.com> Tested-by: Zeng Heng <zengheng4@huawei.com> Tested-by: Hanjun Guo <guohanjun@huawei.com> Signed-off-by: Ben Horgan <ben.horgan@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> (backported from commit 9e5afb7) [fenghuay: Partial port; this tree already has long-counter read paths, mpam_msc_read_mbwu_l(), and T241-MPAM-6 workaround. This commit ports per-width overflow correction, OFLOW_STATUS_L overflow detection, and T241 overflow correction scaling via mpam_msmon_overflow_val(). Apply T241-MPAM-6 * 64 scaling to 31/44-bit counters only; skip 63-bit counters for both the counter read and overflow correction. ] Signed-off-by: Fenghua Yu <fenghuay@nvidia.com> Acked-by: Matthew R. Ochs <mochs@nvidia.com> Acked-by: Jamie Nguyen <jamien@nvidia.com> Acked-by: Carol L Soto <csoto@nvidia.com> Signed-off-by: Brad Figg <bfigg@nvidia.com>
BugLink: https://bugs.launchpad.net/bugs/2157922 This patch fixes a few issues: 1. To enable MBW total bytes, continue MBWU counter selection after CSU setup. This follows upstream 7.1. 2. Use mbm_cntr_assignable for assignment enablement so occupancy and MBWU sharing a class do not break mbm_L3_assignments. This follows commit 4766d7e https://gitlab.arm.com/linux-arm/linux-bh.git mpam_abmc_v4 3. Always bind picked MBWU to mbm_total_bytes. This follows commit b2bbf3b https://gitlab.arm.com/linux-arm/linux-bh.git mpam_abmc_v4 4. Rebind mbm_total_bytes at monitor init to the resource that owns the MBWU class via resctrl_mon_event_set_resource(), so memory-level MSC platforms expose mbm_total_bytes under mon_MB_* and mbm_MB_assignments. This fix is necessary when only MBW total bytes (i.e. no MBW local bytes) is supported. Signed-off-by: Fenghua Yu <fenghuay@nvidia.com> Acked-by: Matthew R. Ochs <mochs@nvidia.com> Acked-by: Carol L Soto <csoto@nvidia.com> Signed-off-by: Brad Figg <bfigg@nvidia.com>
… events BugLink: https://bugs.launchpad.net/bugs/2157922 When mbm_total_bytes is exposed on MBA for memory-level MSC monitors, resctrl_arch_mon_capable() must reflect mon_capable on the backing resource, not only L3. Also gate MBM counter auto-assign and unassign on each enabled event's resource so MBA-backed mbm_total_bytes gets ABMC setup on group create and cleanup on group delete. Signed-off-by: Fenghua Yu <fenghuay@nvidia.com> Acked-by: Matthew R. Ochs <mochs@nvidia.com> Acked-by: Carol L Soto <csoto@nvidia.com> Signed-off-by: Brad Figg <bfigg@nvidia.com>
BugLink: https://bugs.launchpad.net/bugs/2158038 When running in interval mode every third count of a time event isn't showing properly: ``` $ perf stat -e duration_time -a -I 1000 1.001082862 1,002,290,425 duration_time 2.004264262 1,003,183,516 duration_time 3.007381401 <not counted> duration_time 4.011160141 1,003,705,631 duration_time 5.014515385 1,003,290,110 duration_time 6.018539680 <not counted> duration_time 7.022065321 1,003,591,720 duration_time ``` The regression came in with a different fix, found through bisection, commit 68cb156 ("perf tool_pmu: Fix aggregation on duration_time"). The issue is caused by the enabled and running time of the event matching the old_count's and creating a delta of 0, which is indicative of an error. Fixes: 68cb156 ("perf tool_pmu: Fix aggregation on duration_time") Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> (backported from commit bdf96c4) [mochs: Minor context adjustment due to absent definitions] Signed-off-by: Matthew R. Ochs <mochs@nvidia.com> Acked-by: Carol L Soto <csoto@nvidia.com> Acked-by: Nirmoy Das <nirmoyd@nvidia.com> Signed-off-by: Brad Figg <bfigg@nvidia.com>
BugLink: https://bugs.launchpad.net/bugs/2156928 This patch updates the lan743x driver to prevent the use of netdev-based logging APIs (such as netdev_dbg) before the network device has been successfully registered. Using netdev-based logging prior to registration results in log messages referencing "(unnamed net_device) (uninitialized)", which can be confusing and less informative. The driver must use netif_msg_ APIs and device-based logging (e.g. dev_dbg) until netdev registration is complete. This ensures log entries are associated with the correct device context and improves log clarity. After registration, netdev-based logging APIs can be used safely. Signed-off-by: David Thompson <davthompson@nvidia.com> Link: https://patch.msgid.link/20260528165017.421576-1-davthompson@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> (cherry picked from commit e3c6508) Signed-off-by: David Thompson <davthompson@nvidia.com> Acked-by: Matthew R. Ochs <mochs@nvidia.com> Acked-by: Carol L Soto <csoto@nvidia.com> Signed-off-by: Brad Figg <bfigg@nvidia.com>
BugLink: https://bugs.launchpad.net/bugs/2156928 VLAN-tagged interfaces on lan743x devices were previously unreachable via SSH and failed to respond to large ping packets (e.g. "ping -s 1469" given MTU=1500). In these scenarios, "ethtool -S" reports non-zero "RX Oversize Frame Errors". According to Microchip AN2948, the MAC_RX FSE (VLAN field size enforcement) bit determines whether frames with VLAN tags exceeding the base MTU plus tag length are discarded. The driver must set the MAC_RX.FSE bit before setting MAC_RX.RXEN to allow VLAN-tagged frames up to the interface MTU, preventing them from being treated as oversized. As a result, both the base and VLAN-tagged interfaces can use the same MTU without receive errors. Fixes: 23f0703 ("lan743x: Add main source files for new lan743x driver") Signed-off-by: David Thompson <davthompson@nvidia.com> Reviewed-by: Thangaraj Samynathan <Thangaraj.s@microchip.com> Reviewed-by: Nicolai Buchwitz <nb@tipi-net.de> Tested-by: Nicolai Buchwitz <nb@tipi-net.de> # lan7430 on arm64 (RevPi Link: https://patch.msgid.link/20260529210300.433135-1-davthompson@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> (cherry picked from commit 8173d22) Signed-off-by: David Thompson <davthompson@nvidia.com> Acked-by: Matthew R. Ochs <mochs@nvidia.com> Acked-by: Carol L Soto <csoto@nvidia.com> Signed-off-by: Brad Figg <bfigg@nvidia.com>
BugLink: https://bugs.launchpad.net/bugs/2143032 cxlmd->endpoint starts as ERR_PTR(-ENXIO) until endpoint port registration links the memdev to a real cxl_port. Treat NULL and error pointers as "endpoint not linked" before dereferencing cxlmd->endpoint in CXL helper paths. Fixes: eb61834 ("cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation") Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com> Acked-by: Matthew R. Ochs <mochs@nvidia.com> Acked-by: Carol L Soto <csoto@nvidia.com> Acked-by: Seth Forshee <sforshee@nvidia.com> Signed-off-by: Brad Figg <bfigg@nvidia.com>
… messages BugLink: https://bugs.launchpad.net/bugs/2158328 The current MSI-X restoration path assumes the Command register Memory bit is enabled when writing MSI-X messages. But its possible the last saved and restored state of device may not have the Memory bit enabled, even if a device driver later enables Memory bit and MSI-X. Attempting to access Memory space without Memory bit enabled can lead to Unsupported Request (UR) from the device. Fix this by enabling Memory bit and restore it afterwards. Signed-off-by: Farhan Ali <alifm@linux.ibm.com> Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Reviewed-by: Thomas Gleixner <tglx@kernel.org> (cherry picked from https://lore.kernel.org/all/20260622171840.1618-5-alifm@linux.ibm.com/) Signed-off-by: Matthew R. Ochs <mochs@nvidia.com> Acked-by: Carol L Soto <csoto@nvidia.com> Acked-by: Seth Forshee <sforshee@nvidia.com> Signed-off-by: Brad Figg <bfigg@nvidia.com>
BugLink: https://bugs.launchpad.net/bugs/2158714 Introduce type-aware kmalloc-family helpers to replace the common idioms for single object and arrays of objects allocation: ptr = kmalloc(sizeof(*ptr), gfp); ptr = kmalloc(sizeof(struct some_obj_name), gfp); ptr = kzalloc(sizeof(*ptr), gfp); ptr = kmalloc_array(count, sizeof(*ptr), gfp); ptr = kcalloc(count, sizeof(*ptr), gfp); These become, respectively: ptr = kmalloc_obj(*ptr, gfp); ptr = kmalloc_obj(*ptr, gfp); ptr = kzalloc_obj(*ptr, gfp); ptr = kmalloc_objs(*ptr, count, gfp); ptr = kzalloc_objs(*ptr, count, gfp); Beyond the other benefits outlined below, the primary ergonomic benefit is the elimination of needing "sizeof" nor the type name, and the enforcement of assignment types (they do not return "void *", but rather a pointer to the type of the first argument). The type name _can_ be used, though, in the case where an assignment is indirect (e.g. via "return"). This additionally allows[1] variables to be declared via __auto_type: __auto_type ptr = kmalloc_obj(struct foo, gfp); Internal introspection of the allocated type now becomes possible, allowing for future alignment-aware choices to be made by the allocator and future hardening work that can be type sensitive. For example, adding __alignof(*ptr) as an argument to the internal allocators so that appropriate/efficient alignment choices can be made, or being able to correctly choose per-allocation offset randomization within a bucket that does not break alignment requirements. Link: https://lore.kernel.org/all/CAHk-=wiCOTW5UftUrAnvJkr6769D29tF7Of79gUjdQHS_TkF5A@mail.gmail.com/ [1] Acked-by: Vlastimil Babka <vbabka@suse.cz> Link: https://patch.msgid.link/20251203233036.3212363-1-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org> (backported from commit 2932ba8) [bpoirier: Conflicts: - include/linux/slab.h Due to the absence of af92793 slab: Introduce kmalloc_nolock() and kfree_nolock(). (v6.18-rc1) -> Adjust context] Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Acked-by: Aya Levin <ayal@nvidia.com> Tested-by: Carolina Jubran <cjubran@nvidia.com>
BugLink: https://bugs.launchpad.net/bugs/2158714 Currently, the DPLL subsystem assumes that the only supported mode is the one currently active on the device. When dpll_msg_add_mode_supported() is called, it relies on ops->mode_get() and reports that single mode to userspace. This prevents users from discovering other modes the device might be capable of. Add a new callback .supported_modes_get() to struct dpll_device_ops. This allows drivers to populate a bitmap indicating all modes supported by the hardware. Update dpll_msg_add_mode_supported() to utilize this new callback: * if ops->supported_modes_get is defined, use it to retrieve the full bitmap of supported modes. * if not defined, fall back to the existing behavior: retrieve the current mode via ops->mode_get and set the corresponding bit in the bitmap. Finally, iterate over the bitmap and add a DPLL_A_MODE_SUPPORTED netlink attribute for every set bit, accurately reporting the device's capabilities to userspace. Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Link: https://patch.msgid.link/20260114122726.120303-2-ivecera@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> (cherry picked from commit b1f99cc) Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Acked-by: Aya Levin <ayal@nvidia.com> Tested-by: Carolina Jubran <cjubran@nvidia.com>
BugLink: https://bugs.launchpad.net/bugs/2158714 Currently, userspace can retrieve the DPLL working mode but cannot configure it. This prevents changing the device operation, such as switching from manual to automatic mode and vice versa. Add a new callback .mode_set() to struct dpll_device_ops. Extend the netlink policy and device-set command handling to process the DPLL_A_MODE attribute. Update the netlink YAML specification to include the mode attribute in the device-set operation. Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Link: https://patch.msgid.link/20260114122726.120303-3-ivecera@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> (backported from commit e3f6c65) [bpoirier: Conflicts: - Documentation/netlink/specs/dpll.yaml Due to the absence of c035e73 dpll: add phase-offset-monitor feature to netlink spec (v6.17-rc1) a680581 dpll: add phase-offset-avg-factor device attribute to netlink spec (v6.18-rc1) -> Adjust context - drivers/dpll/dpll_netlink.c Due to the absence of e28d5a6 dpll: add phase_offset_avg_factor_get/set callback ops (v6.18-rc1) -> Copy nla_for_each_attr() loop from e28d5a6 - drivers/dpll/dpll_nl.c Due to the absence of a680581 dpll: add phase-offset-avg-factor device attribute to netlink spec (v6.18-rc1) -> Adjust context] Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Acked-by: Aya Levin <ayal@nvidia.com> Tested-by: Carolina Jubran <cjubran@nvidia.com>
BugLink: https://bugs.launchpad.net/bugs/2158714 Extend the DPLL core to support associating a DPLL pin with a firmware node. This association is required to allow other subsystems (such as network drivers) to locate and request specific DPLL pins defined in the Device Tree or ACPI. * Add a .fwnode field to the struct dpll_pin * Introduce dpll_pin_fwnode_set() helper to allow the provider driver to associate a pin with a fwnode after the pin has been allocated * Introduce fwnode_dpll_pin_find() helper to allow consumers to search for a registered DPLL pin using its associated fwnode handle * Ensure the fwnode reference is properly released in dpll_pin_put() Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Link: https://patch.msgid.link/20260203174002.705176-2-ivecera@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> (cherry picked from commit d0f4771) Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Acked-by: Aya Levin <ayal@nvidia.com> Tested-by: Carolina Jubran <cjubran@nvidia.com>
BugLink: https://bugs.launchpad.net/bugs/2158714 Associate the registered DPLL pin with its firmware node by calling dpll_pin_fwnode_set(). This links the created pin object to its corresponding DT/ACPI node in the DPLL core. Consequently, this enables consumer drivers (such as network drivers) to locate and request this specific pin using the fwnode_dpll_pin_find() helper. Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Link: https://patch.msgid.link/20260203174002.705176-3-ivecera@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> (cherry picked from commit e6dc772) Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Acked-by: Aya Levin <ayal@nvidia.com> Tested-by: Carolina Jubran <cjubran@nvidia.com>
BugLink: https://bugs.launchpad.net/bugs/2158714 Currently, the DPLL subsystem reports events (creation, deletion, changes) to userspace via Netlink. However, there is no mechanism for other kernel components to be notified of these events directly. Add a raw notifier chain to the DPLL core protected by dpll_lock. This allows other kernel subsystems or drivers to register callbacks and receive notifications when DPLL devices or pins are created, deleted, or modified. Define the following: - Registration helpers: {,un}register_dpll_notifier() - Event types: DPLL_DEVICE_CREATED, DPLL_PIN_CREATED, etc. - Context structures: dpll_{device,pin}_notifier_info to pass relevant data to the listeners. The notification chain is invoked alongside the existing Netlink event generation to ensure in-kernel listeners are kept in sync with the subsystem state. Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Co-developed-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Link: https://patch.msgid.link/20260203174002.705176-4-ivecera@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> (backported from commit 2be4675) [bpoirier: Conflicts: - include/linux/dpll.h Due to the absence of 30176bf dpll: add phase-adjust-gran pin attribute (v6.19-rc1) -> Adjust context] Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Acked-by: Aya Levin <ayal@nvidia.com> Tested-by: Carolina Jubran <cjubran@nvidia.com>
BugLink: https://bugs.launchpad.net/bugs/2158714 Allow drivers to register DPLL pins without manually specifying a pin index. Currently, drivers must provide a unique pin index when calling dpll_pin_get(). This works well for hardware-mapped pins but creates friction for drivers handling virtual pins or those without a strict hardware indexing scheme. Introduce DPLL_PIN_IDX_UNSPEC (U32_MAX). When a driver passes this value as the pin index: 1. The core allocates a unique index using an IDA 2. The allocated index is mapped to a range starting above `INT_MAX` This separation ensures that dynamically allocated indices never collide with standard driver-provided hardware indices, which are assumed to be within the `0` to `INT_MAX` range. The index is automatically freed when the pin is released in dpll_pin_put(). Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Link: https://patch.msgid.link/20260203174002.705176-5-ivecera@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> (cherry picked from commit 711696b) Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Acked-by: Aya Levin <ayal@nvidia.com> Tested-by: Carolina Jubran <cjubran@nvidia.com>
BugLink: https://bugs.launchpad.net/bugs/2158714 Add parsing for the "mux" string in the 'connection-type' pin property mapping it to DPLL_PIN_TYPE_MUX. Recognizing this type in the driver allows these pins to be taken as parent pins for pin-on-pin pins coming from different modules (e.g. network drivers). Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Link: https://patch.msgid.link/20260203174002.705176-6-ivecera@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> (cherry picked from commit fdad05e) Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Acked-by: Aya Levin <ayal@nvidia.com> Tested-by: Carolina Jubran <cjubran@nvidia.com>
BugLink: https://bugs.launchpad.net/bugs/2158714 Refactor the reference counting mechanism for DPLL devices and pins to improve consistency and prevent potential lifetime issues. Introduce internal helpers __dpll_{device,pin}_{hold,put}() to centralize reference management. Update the internal XArray reference helpers (dpll_xa_ref_*) to automatically grab a reference to the target object when it is added to a list, and release it when removed. This ensures that objects linked internally (e.g., pins referenced by parent pins) are properly kept alive without relying on the caller to manually manage the count. Consequently, remove the now redundant manual `refcount_inc/dec` calls in dpll_pin_on_pin_{,un}register()`, as ownership is now correctly handled by the dpll_xa_ref_* functions. Additionally, ensure that dpll_device_{,un}register()` takes/releases a reference to the device, ensuring the device object remains valid for the duration of its registration. Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Link: https://patch.msgid.link/20260203174002.705176-7-ivecera@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> (cherry picked from commit 729f5e0) Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Acked-by: Aya Levin <ayal@nvidia.com> Tested-by: Carolina Jubran <cjubran@nvidia.com>
BugLink: https://bugs.launchpad.net/bugs/2158714 Add support for the REF_TRACKER infrastructure to the DPLL subsystem. When enabled, this allows developers to track and debug reference counting leaks or imbalances for dpll_device and dpll_pin objects. It records stack traces for every get/put operation and exposes this information via debugfs at: /sys/kernel/debug/ref_tracker/dpll_device_* /sys/kernel/debug/ref_tracker/dpll_pin_* The following API changes are made to support this: 1. dpll_device_get() / dpll_device_put() now accept a 'dpll_tracker *' (which is a typedef to 'struct ref_tracker *' when enabled, or an empty struct otherwise). 2. dpll_pin_get() / dpll_pin_put() and fwnode_dpll_pin_find() similarly accept the tracker argument. 3. Internal registration structures now hold a tracker to associate the reference held by the registration with the specific owner. All existing in-tree drivers (ice, mlx5, ptp_ocp, zl3073x) are updated to pass NULL for the new tracker argument, maintaining current behavior while enabling future debugging capabilities. Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Co-developed-by: Petr Oros <poros@redhat.com> Signed-off-by: Petr Oros <poros@redhat.com> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Link: https://patch.msgid.link/20260203174002.705176-8-ivecera@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> (backported from commit 3c0da10) [bpoirier: Conflicts: - drivers/ptp/ptp_ocp.c Due to the absence of 4c84a5c ptp: ocp: Apply standard pattern for cleaning up loop (v6.19-rc1) -> Adjust context - include/linux/dpll.h Due to the absence of 30176bf dpll: add phase-adjust-gran pin attribute (v6.19-rc1) -> Adjust context] Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Acked-by: Aya Levin <ayal@nvidia.com> Tested-by: Carolina Jubran <cjubran@nvidia.com>
BugLink: https://bugs.launchpad.net/bugs/xxtodo The backport of commit 3c0da10 ("dpll: Add reference count tracking support") is needed for the dpll_tracker typedef used in drivers, however the ref tracker feature itself can be disabled, just like it is in the generic Ubuntu 26.04 config. Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Acked-by: Aya Levin <ayal@nvidia.com> Tested-by: Carolina Jubran <cjubran@nvidia.com>
PR Validation ReportPatchscan ✅ No Missing FixesAll cherry-picked commits checked — no missing upstream fixes found. PR Lint ❌ Errors foundDetailsChecking 11 commits...
Cherry-pick digest:
E: 82971631a93f ("dpll: Add reference count tracking suppo"): patch-ID mismatch with upstream 3c0da1030c58
E: a64709a4a613 ("dpll: Add notifier chain for dpll events"): patch-ID mismatch with upstream 2be467588d6b
E: 42e404381539 ("dpll: add dpll_device op to set working "): patch-ID mismatch with upstream e3f6c65192fe
E: b9cd0ede2b0e ("slab: Introduce kmalloc_obj() and family"): patch-ID mismatch with upstream 2932ba8d9c99
┌──────────────┬──────────────────────────────────────────────────────────────────┬────────────┬─────────┬───────────────────────────┐
│ Local │ Referenced upstream / Patch subject │ Patch-ID │ Subject │ SoB chain │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 6860a5574521 │ [SAUCE] [config] set config_dpll_refcnt_tracker=n │ N/A │ N/A │ bpoirier │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 82971631a93f │ 3c0da1030c58 dpll: Add reference count tracking support │ MISMATCH │ match │ preserved + bpoirier adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ f83b2dc67731 │ 729f5e0153bd dpll: Enhance and consolidate reference counting lo │ match │ match │ preserved + bpoirier adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 83325409cd8f │ fdad05ed4ec2 dpll: zl3073x: Add support for mux pin type │ match │ match │ preserved + bpoirier adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 0f58ca93f4fd │ 711696b3e168 dpll: Support dynamic pin index allocation │ match │ match │ preserved + bpoirier adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ a64709a4a613 │ 2be467588d6b dpll: Add notifier chain for dpll events │ MISMATCH │ match │ preserved + bpoirier adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 69274029b9b3 │ e6dc7727b608 dpll: zl3073x: Associate pin with fwnode handle │ match │ match │ preserved + bpoirier adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 546f5b44bb6a │ d0f4771e2bef dpll: Allow associating dpll pin with a firmware no │ match │ match │ preserved + bpoirier adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 42e404381539 │ e3f6c65192fe dpll: add dpll_device op to set working mode │ MISMATCH │ match │ preserved + bpoirier adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ bbfd2e3869a6 │ b1f99cc88638 dpll: add dpll_device op to get supported modes │ match │ match │ preserved + bpoirier adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ b9cd0ede2b0e │ 2932ba8d9c99 slab: Introduce kmalloc_obj() and family │ MISMATCH │ match │ preserved + bpoirier adde │
└──────────────┴──────────────────────────────────────────────────────────────────┴────────────┴─────────┴───────────────────────────┘
Lint: all checks passed.
|
BaseOS Kernel ReviewSummaryNo critical/high issues found; most notable are two Medium concerns in dpll_core.c: call_dpll_notifiers() holds dpll_lock across notifier callbacks, risking self-deadlock if a callback re-registers, and dpll_pin_idx_alloc() can yield U32_MAX, colliding with the DPLL_PIN_IDX_UNSPEC sentinel and causing pin aliasing. The new kmalloc_obj() macros also silently drop __alloc_size/FORTIFY hardening, plus various doc/changelog inaccuracies. Findings: Critical: 0, High: 0, Medium: 4, Low: 7 Latest watcher review: open review Generated test plan: open test plan Kernel deb build: successful (download debs, 4 files) Head: This comment is maintained by nv-pr-bot. It is updated when the GitHub watcher publishes a newer review. |
|
Looking at the review from Boro it seems that we may need two additional follow-up commits to catch potential misuse of dpll_lock: Maybe it's worth including these as well. |
|
Moreover, as also pointed by Boro, do we need to create a LP bug for this? |
|
Nice findings |
|
@gobenji Can you provide more context on these patches? The 6.17-HWE kernel is going EOL soon and we get these for free in 7.0. |
|
Hi @nvmochs Do you know the date for the EOL? Those changes are for our MGX ARC PRO Telecom Platform, which is currently recommended to use the 6.17 kernel. I'm trying to understand if the guidance will be updated to a newer kernel, and I will follow up with you once I have more information. |
362b68d to
aea4c7d
Compare
We plan to roll from 6.17 to 7.0 in Aug. |
Backport core changes needed to support the upcoming SiTime SiT9531x DPLL clock driver [1]. This will allow building the sit9531x driver as an out-of-tree module.
[1] https://lore.kernel.org/all/20260520191943.73938-1-arouhi@sitime.com/
LP: https://bugs.launchpad.net/bugs/2158714
RM: 4986740