Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pull] master from torvalds:master #1769

Merged
merged 100 commits into from
Feb 7, 2025
Merged

[pull] master from torvalds:master #1769

merged 100 commits into from
Feb 7, 2025

Conversation

pull[bot]
Copy link

@pull pull bot commented Feb 7, 2025

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.1)

Can you help keep this open source service alive? 💖 Please sponsor : )

keithbusch and others added 30 commits January 14, 2025 07:35
Fixes: 3ec5c62 ("nvmet: handle rw's limited retry flag")
Reported-by: kernel test robot <[email protected]>
Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
Cc: Guixin Liu <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
The value of size is 0 when there is no dma buffer allocated. The value
of i also remains 0. So, no need to free the dma buffer in out_free_bufs.
Hence, remove the redundant dma frees.

Signed-off-by: Francis Pravin <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
On the TUXEDO InfinityFlex, a Samsung 990 Evo NVMe leads to a high power
consumption in s2idle sleep (4 watts).

This patch applies 'Force No Simple Suspend' quirk to achieve a sleep with
a lower power consumption, typically around 1.4 watts.

Signed-off-by: Georg Gottleuber <[email protected]>
Cc: [email protected]
Signed-off-by: Werner Sembach <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
On the TUXEDO InfinityBook Pro Gen9 Intel, a Samsung 990 Evo NVMe leads to
a high power consumption in s2idle sleep (4 watts).

This patch applies 'Force No Simple Suspend' quirk to achieve a sleep with
a lower power consumption, typically around 1.2 watts.

Signed-off-by: Georg Gottleuber <[email protected]>
Cc: [email protected]
Signed-off-by: Werner Sembach <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
The initial controller initialization mimiks the reconnect loop
behavior by switching from NEW to RESETTING and then to CONNECTING.

The transition from NEW to CONNECTING is a valid transition, so there is
no point entering the RESETTING state. TCP and RDMA also transition
directly to CONNECTING state.

Reviewed-by: Sagi Grimberg <[email protected]>
Reviewed-by: Hannes Reinecke <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Daniel Wagner <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
When the set feature attempts fails with any NVME status code set in
nvme_set_queue_count, the function still report success. Though the
numbers of queues set to 0. This is done to support controllers in
degraded state (the admin queue is still up and running but no IO
queues).

Though there is an exception. When nvme_set_features reports an host
path error, nvme_set_queue_count should propagate this error as the
connectivity is lost, which means also the admin queue is not working
anymore.

Fixes: 9a0be7a ("nvme: refactor set_queue_count")
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Hannes Reinecke <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Signed-off-by: Daniel Wagner <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
When a connectivity loss occurs while nvme_fc_create_assocation is
being executed, it's possible that the ctrl ends up stuck in the LIVE
state:

  1) nvme nvme10: NVME-FC{10}: create association : ...
  2) nvme nvme10: NVME-FC{10}: controller connectivity lost.
                  Awaiting Reconnect
     nvme nvme10: queue_size 128 > ctrl maxcmd 32, reducing to maxcmd
  3) nvme nvme10: Could not set queue count (880)
     nvme nvme10: Failed to configure AEN (cfg 900)
  4) nvme nvme10: NVME-FC{10}: controller connect complete
  5) nvme nvme10: failed nvme_keep_alive_end_io error=4

A connection attempt starts 1) and the ctrl is in state CONNECTING.
Shortly after the LLDD driver detects a connection lost event and calls
nvme_fc_ctrl_connectivity_loss 2). Because we are still in CONNECTING
state, this event is ignored.

nvme_fc_create_association continues to run in parallel and tries to
communicate with the controller and these commands will fail. Though
these errors are filtered out, e.g in 3) setting the I/O queues numbers
fails which leads to an early exit in nvme_fc_create_io_queues. Because
the number of IO queues is 0 at this point, there is nothing left in
nvme_fc_create_association which could detected the connection drop.
Thus the ctrl enters LIVE state 4).

Eventually the keep alive handler times out 5) but because nothing is
being done, the ctrl stays in LIVE state.

There is already the ASSOC_FAILED flag to track connectivity loss event
but this bit is set too late in the recovery code path. Move this into
the connectivity loss event handler and synchronize it with the state
change. This ensures that the ASSOC_FAILED flag is seen by
nvme_fc_create_io_queues and it does not enter the LIVE state after a
connectivity loss event. If the connectivity loss event happens after we
entered the LIVE state the normal error recovery path is executed.

Signed-off-by: Daniel Wagner <[email protected]>
Reviewed-by: Hannes Reinecke <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
…trol_type()

powercap_register_control_type() calls device_register(), but does not
release the refcount of the device when it fails.

Call put_device() before returning an error to balance the refcount.

Since the kfree(control_type) will be done by powercap_release(), remove
the lines in powercap_register_control_type() before returning the error.

This bug was found by an experimental verifier that I am developing.

Signed-off-by: Joe Hattori <[email protected]>
Link: https://patch.msgid.link/[email protected]
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <[email protected]>
Simply free an allocated buffer once we copied its content
to the request sgl.

kmemleak complaint:
unreferenced object 0xffff8cd40c388000 (size 4096):
  comm "kworker/2:2H", pid 14739, jiffies 4401313113
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace (crc 0):
    [<ffffffff9e01087a>] kmemleak_alloc+0x4a/0x90
    [<ffffffff9d30324a>] __kmalloc_cache_noprof+0x35a/0x420
    [<ffffffffc180b0e2>] nvmet_execute_identify+0x912/0x9f0 [nvmet]
    [<ffffffffc181a72c>] nvmet_tcp_try_recv_pdu+0x84c/0xc90 [nvmet_tcp]
    [<ffffffffc181ac02>] nvmet_tcp_io_work+0x82/0x8b0 [nvmet_tcp]
    [<ffffffff9cfa7158>] process_one_work+0x178/0x3e0
    [<ffffffff9cfa8e9c>] worker_thread+0x2ec/0x420
    [<ffffffff9cfb2140>] kthread+0xf0/0x120
    [<ffffffff9cee36a4>] ret_from_fork+0x44/0x70
    [<ffffffff9ce7fdda>] ret_from_fork_asm+0x1a/0x30

Fixes: 84909f7 ("nvmet: use kzalloc instead of ZERO_PAGE in nvme_execute_identify_ns_nvm()")
Signed-off-by: Sagi Grimberg <[email protected]>
Reviewed-by: Nilay Shroff <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
The header drm_print.h uses members of struct drm_device pointers, as
such, it should include drm_device.h to let the compiler know the full
type definition.

Without such include, users of drm_print.h that don't explicitly need
drm_device.h would bump into build errors and be forced to include the
latter.

Signed-off-by: Gustavo Sousa <[email protected]>
Reviewed-by: Jani Nikula <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>
The ASTDP transmitter sometimes takes up to 1 second for enabling the
video signal, while the timeout is only 200 msec. This results in a
kernel error message. Increase the timeout to 1 second. An example
of the error message is shown below.

[  697.084433] ------------[ cut here ]------------
[  697.091115] ast 0000:02:00.0: [drm] drm_WARN_ON(!__ast_dp_wait_enable(ast, enabled))
[  697.091233] WARNING: CPU: 1 PID: 160 at drivers/gpu/drm/ast/ast_dp.c:232 ast_dp_set_enable+0x123/0x140 [ast]
[...]
[  697.272469] RIP: 0010:ast_dp_set_enable+0x123/0x140 [ast]
[...]
[  697.415283] Call Trace:
[  697.420727]  <TASK>
[  697.425908]  ? show_trace_log_lvl+0x196/0x2c0
[  697.433304]  ? show_trace_log_lvl+0x196/0x2c0
[  697.440693]  ? drm_atomic_helper_commit_modeset_enables+0x30a/0x470
[  697.450115]  ? ast_dp_set_enable+0x123/0x140 [ast]
[  697.458059]  ? __warn.cold+0xaf/0xca
[  697.464713]  ? ast_dp_set_enable+0x123/0x140 [ast]
[  697.472633]  ? report_bug+0x134/0x1d0
[  697.479544]  ? handle_bug+0x58/0x90
[  697.486127]  ? exc_invalid_op+0x13/0x40
[  697.492975]  ? asm_exc_invalid_op+0x16/0x20
[  697.500224]  ? preempt_count_sub+0x14/0xc0
[  697.507473]  ? ast_dp_set_enable+0x123/0x140 [ast]
[  697.515377]  ? ast_dp_set_enable+0x123/0x140 [ast]
[  697.523227]  drm_atomic_helper_commit_modeset_enables+0x30a/0x470
[  697.532388]  drm_atomic_helper_commit_tail+0x58/0x90
[  697.540400]  ast_mode_config_helper_atomic_commit_tail+0x30/0x40 [ast]
[  697.550009]  commit_tail+0xfe/0x1d0
[  697.556547]  drm_atomic_helper_commit+0x198/0x1c0

This is a cosmetical problem. Enabling the video signal still works
even with the error message. The problem has always been present, but
only recent versions of the ast driver warn about missing the timeout.

Signed-off-by: Thomas Zimmermann <[email protected]>
Fixes: 4e29cc7 ("drm/ast: astdp: Replace ast_dp_set_on_off()")
Cc: Thomas Zimmermann <[email protected]>
Cc: Jocelyn Falempe <[email protected]>
Cc: Dave Airlie <[email protected]>
Cc: [email protected]
Cc: <[email protected]> # v6.13+
Reviewed-by: Jocelyn Falempe <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
So use the __le32 type for it.

Fixes: 6202783 ("nvmet: Improve nvmet_alloc_ctrl() interface and implementation")
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
The kato field is little endian on the wire, but native endian in
the in-core structure, add the missing byte swap.

Fixes: 6202783 ("nvmet: Improve nvmet_alloc_ctrl() interface and implementation")
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
Fix ISCSI_IBFT Kconfig entry, replace tab with a space character.

Fixes: 138fe4e ("Firmware: add iSCSI iBFT Support")
Signed-off-by: Prasad Pandit <[email protected]>
Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
…ic()

When performing an iSCSI boot using IPv6, iscsistart still reads the
/sys/firmware/ibft/ethernetX/subnet-mask entry. Since the IPv6 prefix
length is 64, this causes the shift exponent to become negative,
triggering a UBSAN warning. As the concept of a subnet mask does not
apply to IPv6, the value is set to ~0 to suppress the warning message.

Signed-off-by: Chengen Du <[email protected]>
Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
Add check for the return value of komeda_get_layer_fourcc_list()
to catch the potential exception.

Fixes: 5d51f6c ("drm/komeda: Add writeback support")
Cc: [email protected]
Signed-off-by: Haoxiang Li <[email protected]>
Acked-by: Liviu Dudau <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Liviu Dudau <[email protected]>
If the hotplug detect of a display is low for longer than one second
(configurable through drm_dp_cec_unregister_delay), then the CEC adapter
is unregistered since we assume the display was disconnected. If the
HPD went low for less than one second, then we check if the properties
of the CEC adapter have changed, since that indicates that we actually
switch to new hardware and we have to unregister the old CEC device and
register a new one.

Unfortunately, the test for changed properties was written poorly, and
after a new CEC capability was added to the CEC core code the test always
returned true (i.e. the properties had changed).

As a result the CEC device was unregistered and re-registered for every
HPD toggle. If the CEC remote controller integration was also enabled
(CONFIG_MEDIA_CEC_RC was set), then the corresponding input device was
also unregistered and re-registered. As a result the input device in
/sys would keep incrementing its number, e.g.:

/sys/devices/pci0000:00/0000:00:08.1/0000:e7:00.0/rc/rc0/input20

Since short HPD toggles are common, the number could over time get into
the thousands.

While not a serious issue (i.e. nothing crashes), it is not intended
to work that way.

This patch changes the test so that it only checks for the single CEC
capability that can actually change, and it ignores any other
capabilities, so this is now safe as well if new caps are added in
the future.

With the changed test the bit under #ifndef CONFIG_MEDIA_CEC_RC can be
dropped as well, so that's a nice cleanup.

Signed-off-by: Hans Verkuil <[email protected]>
Reported-by: Farblos <[email protected]>
Reviewed-by: Dmitry Baryshkov <[email protected]>
Fixes: 2c6d1ff ("drm: add support for DisplayPort CEC-Tunneling-over-AUX")
Tested-by: Farblos <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Dmitry Baryshkov <[email protected]>
To suppress the compiler "warning: symbol 'nvme_tls_attrs_group' was not
declared. Should it be static?"

Fixes: 1e48b34 ("nvme: split off TLS sysfs attributes into a separate group")
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
Do not access the state variable directly, instead use proper
synchronization so not stale data is read.

Fixes: e6e7f7a ("nvme: ensure reset state check ordering")
Signed-off-by: Daniel Wagner <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
queue_limits_cancel_update() must only be called if
queue_limits_start_update() is called first. Remove the
queue_limits_cancel_update() call from linear_set_limits() because
there is no corresponding queue_limits_start_update() call.

This bug was discovered by annotating all mutex operations with clang
thread-safety attributes and by building the kernel with clang and
-Wthread-safety.

Cc: Yu Kuai <[email protected]>
Cc: Coly Li <[email protected]>
Cc: Mike Snitzer <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Fixes: 127186c ("md: reintroduce md-linear")
Signed-off-by: Bart Van Assche <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Song Liu <[email protected]>
My sparc64 defconfig build failed like this:

drivers/block/sunvdc.c: In function 'vdc_queue_drain':
drivers/block/sunvdc.c:1130:9: error: too many arguments to function 'blk_mq_unquiesce_queue'
 1130 |         blk_mq_unquiesce_queue(q, memflags);
      |         ^~~~~~~~~~~~~~~~~~~~~~
In file included from drivers/block/sunvdc.c:10:
include/linux/blk-mq.h:895:6: note: declared here
  895 | void blk_mq_unquiesce_queue(struct request_queue *q);
      |      ^~~~~~~~~~~~~~~~~~~~~~
drivers/block/sunvdc.c:1131:9: error: too few arguments to function 'blk_mq_unfreeze_queue'
 1131 |         blk_mq_unfreeze_queue(q);
      |         ^~~~~~~~~~~~~~~~~~~~~
In file included from drivers/block/sunvdc.c:10:
include/linux/blk-mq.h:914:1: note: declared here
  914 | blk_mq_unfreeze_queue(struct request_queue *q, unsigned int memflags)
      | ^~~~~~~~~~~~~~~~~~~~~

Fixes: 1e1a9ce ("block: force noio scope in blk_mq_freeze_queue")
Cc: Christoph Hellwig <[email protected]>
Cc: Jens Axboe <[email protected]>
Signed-off-by: Stephen Rothwell <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Commit c8c68c3 ("cpufreq: amd-pstate: initialize core precision
boost state") sets per-policy boost flag to false when boost fail.
However, this boost flag will be set to reverse value in
store_local_boost() and cpufreq_boost_trigger_state() in cpufreq.c. This
will cause the per-policy boost flag set to true when fail to set boost.
Remove the extra assignment in amd_pstate_set_boost() and keep all
operations on per-policy boost flag outside of set_boost() to fix this
problem.

Fixes: c8c68c3 ("cpufreq: amd-pstate: initialize core precision boost state")
Signed-off-by: Lifeng Zheng <[email protected]>
Reviewed-by: Mario Limonciello <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mario Limonciello <[email protected]>
Ensure IRQs and IPC are properly disabled if HW sched or DCT
initialization fails.

Fixes: cc3c72c ("accel/ivpu: Refactor failure diagnostics during boot")
Cc: [email protected] # v6.13+
Reviewed-by: Karol Wachowski <[email protected]>
Reviewed-by: Jeffrey Hugo <[email protected]>
Signed-off-by: Jacek Lawrynowicz <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
pm_runtime_resume_and_get() sets dev->power.runtime_error that causes
all subsequent pm_runtime_get_sync() calls to fail.
Clear the runtime_error using pm_runtime_set_suspended(), so the driver
doesn't have to be reloaded to recover when the NPU fails to boot during
runtime resume.

Fixes: 7d4b4c7 ("accel/ivpu: Remove suspend_reschedule_counter")
Cc: [email protected] # v6.11+
Reviewed-by: Maciej Falkowski <[email protected]>
Reviewed-by: Jeffrey Hugo <[email protected]>
Signed-off-by: Jacek Lawrynowicz <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Disable runtime PM for the duration of reset/recovery so it is possible
to set the correct runtime PM state depending on the outcome of the
`ivpu_resume()`. Don’t suspend or reset the HW if the NPU is suspended
when the reset/recovery is requested. Also, move common reset/recovery
code to separate functions for better code readability.

Fixes: 27d1926 ("accel/ivpu: Improve recovery and reset support")
Cc: [email protected] # v6.8+
Reviewed-by: Maciej Falkowski <[email protected]>
Reviewed-by: Jeffrey Hugo <[email protected]>
Signed-off-by: Jacek Lawrynowicz <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
When topology changes, before beginning a new HDCP authentication by
sending AKE_init message we need to first authenticate only the
repeater. Only after repeater authentication failure, it makes sense
to start a new HDCP authentication. Even though it made sense to not
enable HDCP directly from check_link and schedule it for later, repeater
authentication needs to be done immediately.

--v2
-Fix comment grammatical errors [Ankit]

Fixes: 47ef55a ("drm/i915/hdcp: Don't enable HDCP2.2 directly from check_link")
Signed-off-by: Suraj Kandpal <[email protected]>
Reviewed-by: Ankit Nautiyal <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 605a33e)
Signed-off-by: Rodrigo Vivi <[email protected]>
Use intel_encoder_is_hdmi function which was recently introduced to
see if encoder is HDMI or not.

--v2
-Add Fixes tag [Jani]

Fixes: 6a3691c ("drm/i915/hdcp: Disable HDCP Line Rekeying for HDCP2.2 on HDMI")
Signed-off-by: Suraj Kandpal <[email protected]>
Reviewed-by: Jani Nikula <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 2499212)
Signed-off-by: Rodrigo Vivi <[email protected]>
When running igt@gem_exec_balancer@individual for multiple iterations,
it is seen that the delta busyness returned by PMU is 0. The issue stems
from a combination of 2 implementation specific details:

1) gt_park is throttling __update_guc_busyness_stats() so that it does
not hog PCI bandwidth for some use cases. (Ref: 59bcdb5)

2) busyness implementation always returns monotonically increasing
counters. (Ref: cf907f6)

If an application queried an engine while it was active,
engine->stats.guc.running is set to true. Following that, if all PM
wakeref's are released, then gt is parked. At this time the throttling
of __update_guc_busyness_stats() may result in a missed update to the
running state of the engine (due to (1) above). This means subsequent
calls to guc_engine_busyness() will think that the engine is still
running and they will keep updating the cached counter (stats->total).
This results in an inflated cached counter.

Later when the application runs a workload and queries for busyness, we
return the cached value since it is larger than the actual value (due to
(2) above)

All subsequent queries will return the same large (inflated) value, so
the application sees a delta busyness of zero.

Fix the issue by resetting the running state of engines each time
intel_guc_busyness_park() is called.

v2: (Rodrigo)
- Use the correct tag in commit message
- Drop the redundant wakeref check in guc_engine_busyness() and update
  commit message

Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13366
Fixes: cf907f6 ("i915/guc: Ensure busyness counter increases motonically")
Signed-off-by: Umesh Nerlige Ramappa <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 431b742e2bfc9f6dd713f261629741980996d001)
Signed-off-by: Rodrigo Vivi <[email protected]>
When converting to folios the cleanup path of shmem_get_pages() was
missed. When a DMA remap fails and the max segment size is greater than
PAGE_SIZE it will attempt to retry the remap with a PAGE_SIZEd segment
size. The cleanup code isn't properly using the folio apis and as a
result isn't handling compound pages correctly.

v2 -> v3:
(Ville) Just use shmem_sg_free_table() as-is in the failure path of
shmem_get_pages(). shmem_sg_free_table() will clear mapping unevictable
but it will be reset when it retries in shmem_sg_alloc_table().

v1 -> v2:
(Ville) Fixed locations where we were not clearing mapping unevictable.

Cc: [email protected]
Cc: Ville Syrjala <[email protected]>
Cc: Vidya Srinivas <[email protected]>
Link: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13487
Link: https://lore.kernel.org/lkml/[email protected]/
Fixes: 0b62af2 ("i915: convert shmem_sg_free_table() to use a folio_batch")
Signed-off-by: Brian Geffon <[email protected]>
Suggested-by: Tomasz Figa <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Jonathan Cavitt <[email protected]>
Tested-by: Vidya Srinivas <[email protected]>
Signed-off-by: Ville Syrjälä <[email protected]>
(cherry picked from commit 9e304a18630875352636ad52a3d2af47c3bde824)
Signed-off-by: Rodrigo Vivi <[email protected]>
I'm seeing underruns with these 64bpp YUV formats on TGL.

The weird details:
- only happens on pipe B/C/D SDR planes, pipe A SDR planes
  seem fine, as do all HDR planes
- somehow CDCLK related, higher CDCLK allows for bigger plane
  with these formats without underruns. With 300MHz CDCLK I
  can only go up to 1200 pixels wide or so, with 650MHz even
  a 3840 pixel wide plane was OK
- ICL and ADL so far appear unaffected

So not really sure what's the deal with this, but bspec does
state "64-bit formats supported only on the HDR planes" so
let's just drop these formats from the SDR planes. We already
disallow 64bpp RGB formats.

Cc: [email protected]
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Juha-Pekka Heikkila <[email protected]>
(cherry picked from commit 35e1aac)
Signed-off-by: Rodrigo Vivi <[email protected]>
Su Hui and others added 28 commits February 7, 2025 10:27
Clang static checker(scan-build) warning:
fs/stat.c:287:21: warning: The left expression of the compound assignment is
an uninitialized value. The computed value will also be garbage.
  287 |                 stat->result_mask |= STATX_MNT_ID_UNIQUE;
      |                 ~~~~~~~~~~~~~~~~~ ^
fs/stat.c:290:21: warning: The left expression of the compound assignment is
an uninitialized value. The computed value will also be garbage.
  290 |                 stat->result_mask |= STATX_MNT_ID;

When vfs_getattr() failed because of security_inode_getattr(), 'stat' is
uninitialized. In this case, there is a harmless garbage problem in
vfs_statx_path(). It's better to return error directly when
vfs_getattr() failed, avoiding garbage value and more clearly.

Signed-off-by: Su Hui <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Christian Brauner <[email protected]>
Building with GCC 15 results in build error
fs/vboxsf/super.c:24:54: error: initializer-string for array of ‘unsigned char’ is too long [-Werror=unterminated-string-initialization]
   24 | static const unsigned char VBSF_MOUNT_SIGNATURE[4] = "\000\377\376\375";
      |                                                      ^~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors

Due to GCC having enabled -Werror=unterminated-string-initialization[0]
by default. Separately initializing each array element of
VBSF_MOUNT_SIGNATURE to ensure NUL termination, thus satisfying GCC 15
and fixing the build error.

[0]: https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wno-unterminated-string-initialization

Signed-off-by: Brahmajit Das <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Hans de Goede <[email protected]>
Signed-off-by: Christian Brauner <[email protected]>
Just like it's normal for unset values to be zero, unset strings should be
empty instead of containing random values.

It seems to be a typical mistake that the mask returned by statmount is not
checked, which can result in various bugs.

With this fix, these bugs are prevented, since it is highly likely that
userspace would just want to turn the missing mask case into an empty
string anyway (most of the recently found cases are of this type).

Link: https://lore.kernel.org/all/CAJfpegsVCPfCn2DpM8iiYSS5DpMsLB8QBUCHecoj6s0Vxf4jzg@mail.gmail.com/
Fixes: 68385d7 ("statmount: simplify string option retrieval")
Fixes: 46eae99 ("add statmount(2) syscall")
Cc: [email protected] # v6.8
Signed-off-by: Miklos Szeredi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Jeff Layton <[email protected]>
Signed-off-by: Christian Brauner <[email protected]>
Move the initialization of gl_lockref from gfs2_init_glock_once() to
gfs2_glock_get().  This allows to use lockref_init() there.

Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Andreas Gruenbacher <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Christian Brauner <[email protected]>
In qd_alloc(), initialize the lockref count to 1 to cover the common
case.  Compensate for that in gfs2_quota_init() by adjusting the count
back down to 0; this only occurs when mounting the filesystem rw.

Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Andreas Gruenbacher <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Christian Brauner <[email protected]>
All users of lockref_init() now initialize the count to 1, so hardcode
that and remove the count argument.

Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Andreas Gruenbacher <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Christian Brauner <[email protected]>
Andreas Gruenbacher <[email protected]> says:

Here's an updated version with an additional comment saying that
lockref_init() initializes count to 1.

* patches from https://lore.kernel.org/r/[email protected]:
  lockref: remove count argument of lockref_init
  gfs2: switch to lockref_init(..., 1)
  gfs2: use lockref_init for gl_lockref

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Christian Brauner <[email protected]>
The FMODE_NONOTIFY_* bits are a 2-bits mode.  Open coding manipulation
of those bits is risky.  Use an accessor file_set_fsnotify_mode() to
set the mode.

Rename file_set_fsnotify_mode() => file_set_fsnotify_mode_from_watchers()
to make way for the simple accessor name.

Signed-off-by: Amir Goldstein <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Christian Brauner <[email protected]>
Prepending security options was made conditional on sb->s_op->show_options,
but security options are independent of sb options.

Fixes: 056d331 ("fs: prepend statmount.mnt_opts string with security_sb_mnt_opts()")
Fixes: f9af549 ("fs: export mount options via statmount()")
Cc: [email protected] # v6.11
Signed-off-by: Miklos Szeredi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Christian Brauner <[email protected]>
Most pseudo files are not applicable for fsnotify events at all,
let alone to the new pre-content events.

Disable notifications to all files allocated with alloc_file_pseudo()
and enable legacy inotify events for the specific cases of pipe and
socket, which have known users of inotify events.

Pre-content events are also kept disabled for sockets and pipes.

Fixes: 20bf82a ("mm: don't allow huge faults for files with pre content watches")
Reported-by: Alex Williamson <[email protected]>
Closes: https://lore.kernel.org/linux-fsdevel/[email protected]/
Suggested-by: Linus Torvalds <[email protected]>
Link: https://lore.kernel.org/linux-fsdevel/CAHk-=wi2pThSVY=zhO=ZKxViBj5QCRX-=AS2+rVknQgJnHXDFg@mail.gmail.com/
Tested-by: Alex Williamson <[email protected]>
Signed-off-by: Amir Goldstein <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Christian Brauner <[email protected]>
STATMOUNT_MNT_OPTS can actually be missing if there are no options.  This
is a change of behavior since 75ead69 ("fs: don't let statmount return
empty strings").

The other checks shouldn't actually trigger, but add them for correctness
and for easier debugging if the test fails.

Signed-off-by: Miklos Szeredi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Christian Brauner <[email protected]>
After introducing pre-content events, we had a regression related to
disabling huge faults on files that should never have pre-content events
enabled.

This happened because the default f_mode of allocated files (0) does
not disable pre-content events.

Pre-content events are disabled in file_set_fsnotify_mode_by_watchers()
but internal files may not get to call this helper.

Initialize f_mode to disable permission and pre-content events for all
files and if needed they will be enabled for the callers of
file_set_fsnotify_mode_by_watchers().

Fixes: 20bf82a ("mm: don't allow huge faults for files with pre content watches")
Reported-by: Alex Williamson <[email protected]>
Closes: https://lore.kernel.org/linux-fsdevel/[email protected]/
Tested-by: Alex Williamson <[email protected]>
Signed-off-by: Amir Goldstein <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Christian Brauner <[email protected]>
Amir Goldstein <[email protected]> says:

The two Fix patches have been tested by Alex together and each one
independently.

I also verified that they pass the LTP inoityf/fanotify tests.

* patches from https://lore.kernel.org/r/[email protected]:
  fsnotify: disable pre-content and permission events by default
  fsnotify: disable notification by default for all pseudo files
  fsnotify: use accessor to set FMODE_NONOTIFY_*

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Christian Brauner <[email protected]>
Pidfs supports extensible and non-extensible ioctls. The extensible
ioctls need to check for the ioctl number itself not just the ioctl
command otherwise both backward- and forward compatibility are broken.

The pidfs ioctl handler also needs to look at the type of the ioctl
command to guard against cases where "[...] a daemon receives some
random file descriptor from a (potentially less privileged) client and
expects the FD to be of some specific type, it might call ioctl() on
this FD with some type-specific command and expect the call to fail if
the FD is of the wrong type; but due to the missing type check, the
kernel instead performs some action that userspace didn't expect."
(cf. [1]]

Link: https://lore.kernel.org/r/[email protected]
Link: https://lore.kernel.org/r/CAG48ez2K9A5GwtgqO31u9ZL292we8ZwAA=TJwwEv7wRuJ3j4Lw@mail.gmail.com [1]
Fixes: 8ce3528 ("pidfs: check for valid ioctl commands")
Acked-by: Luca Boccassi <[email protected]>
Reported-by: Jann Horn <[email protected]>
Cc: [email protected] # v6.13; please backport with 8ce3528 ("pidfs: check for valid ioctl commands")
Signed-off-by: Christian Brauner <[email protected]>
This costs a strlen() call when instatianating a symlink.

Preferably it would be hidden behind VFS_WARN_ON (or compatible), but
there is no such facility at the moment. With the facility in place the
call can be patched out in production kernels.

In the meantime, since the cost is being paid unconditionally, use the
result to a fixup the bad caller.

This is not expected to persist in the long run (tm).

Sample splat:
bad length passed for symlink [/tmp/syz-imagegen43743633/file0/file0] (got 131109, expected 37)
[rest of WARN blurp goes here]

Signed-off-by: Mateusz Guzik <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Christian Brauner <[email protected]>
Fix a possible memory leak in the power capping subsystem (Joe Hattori).

* pm-powercap:
  powercap: call put_device() on an error path in powercap_register_control_type()
Merge a new ACPI IRQ override quirk for Eluktronics MECH-17 (Gannon
Kolding) and an acpi_data_prop_read() fix making it reflect the OF
counterpart behavior in error cases (Andy Shevchenko).

* acpi-property:
  ACPI: property: Fix return value for nval == 0 in acpi_data_prop_read()

* acpi-resource:
  ACPI: resource: IRQ override for Eluktronics MECH-17
…kernel/git/mdraid/linux into block-6.14

Pull MD fix from Song:

"This patch, by Bart Van Assche, fixes an error handling path for
 md-linear."

* tag 'md-6.14-20250206' of https://git.kernel.org/pub/scm/linux/kernel/git/mdraid/linux:
  md: Fix linear_set_limits()
I need to filter my emails better, switch to [email protected] address
to help with that.

Signed-off-by: Pavel Machek <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
I no longer have any faith left in the kernel development process or
community management approach.

Apple/ARM platform development will continue downstream. If I feel like
sending some patches upstream in the future myself for whatever subtree
I may, or I may not. Anyone who feels like fighting the upstreaming
fight themselves is welcome to do so.

Signed-off-by: Hector Martin <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Pull bcachefs fixes from Kent Overstreet:
 "Nothing major, things continue to be fairly quiet over here.

   - add a SubmittingPatches to clarify that patches submitted for
     bcachefs do, in fact, need to be tested

   - discard path now correctly issues journal flushes when needed, this
     fixes performance issues when the filesystem is nearly full and
     we're bottlenecked on copygc

   - fix a bug that could cause the pending rebalance work accounting to
     be off when devices are being onlined/offlined; users should report
     if they are still seeing this

   - and a few more trivial ones"

* tag 'bcachefs-2025-02-06.2' of git://evilpiepirate.org/bcachefs:
  bcachefs: bch2_bkey_sectors_need_rebalance() now only depends on bch_extent_rebalance
  bcachefs: Fix rcu imbalance in bch2_fs_btree_key_cache_exit()
  bcachefs: Fix discard path journal flushing
  bcachefs: fix deadlock in journal_entry_open()
  bcachefs: fix incorrect pointer check in __bch2_subvolume_delete()
  bcachefs docs: SubmittingPatches.rst
…kernel/git/vfs/vfs

Pull vfs fixes from Christian Brauner:

 - Fix fsnotify FMODE_NONOTIFY* handling.

   This also disables fsnotify on all pseudo files by default apart from
   very select exceptions. This carries a regression risk so we need to
   watch out and adapt accordingly. However, it is overall a significant
   improvement over the current status quo where every rando file can
   get fsnotify enabled.

 - Cleanup and simplify lockref_init() after recent lockref changes.

 - Fix vboxfs build with gcc-15.

 - Add an assert into inode_set_cached_link() to catch corrupt links.

 - Allow users to also use an empty string check to detect whether a
   given mount option string was empty or not.

 - Fix how security options were appended to statmount()'s ->mnt_opt
   field.

 - Fix statmount() selftests to always check the returned mask.

 - Fix uninitialized value in vfs_statx_path().

 - Fix pidfs_ioctl() sanity checks to guard against ioctl() overloading
   and preserve extensibility.

* tag 'vfs-6.14-rc2.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  vfs: sanity check the length passed to inode_set_cached_link()
  pidfs: improve ioctl handling
  fsnotify: disable pre-content and permission events by default
  selftests: always check mask returned by statmount(2)
  fsnotify: disable notification by default for all pseudo files
  fs: fix adding security options to statmount.mnt_opt
  fsnotify: use accessor to set FMODE_NONOTIFY_*
  lockref: remove count argument of lockref_init
  gfs2: switch to lockref_init(..., 1)
  gfs2: use lockref_init for gl_lockref
  statmount: let unset strings be empty
  vboxsf: fix building with GCC 15
  fs/stat.c: avoid harmless garbage value problem in vfs_statx_path()
…linux/kernel/git/brgl/linux

Pull gpio fixes from Bartosz Golaszewski:

 - fix interrupt support in gpio-pca953x

 - fix configfs attribute locking in gpio-sim

 - limit the visibility of the GPIO_GRGPIO Kconfig symbol to OF systems
   only

 - update MAINTAINERS

* tag 'gpio-fixes-for-v6.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  MAINTAINERS: Use my kernel.org address for ACPI GPIO work
  gpio: GPIO_GRGPIO should depend on OF
  gpio: sim: lock hog configfs items if present
  gpio: pca953x: Improve interrupt support
…l/git/rafael/linux-pm

Pull ACPI fixes from Rafael Wysocki:
 "These fix three assorted issues, including one recent regression:

   - Add an ACPI IRQ override quirk for Eluktronics MECH-17 to make the
     internal keyboard work (Gannon Kolding)

   - Make acpi_data_prop_read() reflect the OF counterpart behavior in
     error cases (Andy Shevchenko)

   - Remove recently added strict ACPI PRM handler address checks that
     prevented PRM from working on some platforms in the field (Aubrey
     Li)"

* tag 'acpi-6.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: PRM: Remove unnecessary strict handler address checks
  ACPI: resource: IRQ override for Eluktronics MECH-17
  ACPI: property: Fix return value for nval == 0 in acpi_data_prop_read()
…git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
 "These fix a handful of issues in the amd-pstate driver, the airoha
  cpufreq driver build, a (recently added) possible NULL pointer
  dereference in the cpufreq code and a possible memory leak in the
  power capping subsystem:

   - Fix cpufreq_policy reference counting and prevent max_perf from
     going above the current limit in amd-pstate, and drop a redundant
     goto label from it (Dhananjay Ugwekar)

   - Prevent the per-policy boost_enabled flag in amd-pstate from
     getting out of sync with the actual state after boot failures
     (Lifeng Zheng)

   - Fix a recently added possible NULL pointer dereference in the
     cpufreq core (Aboorva Devarajan)

   - Fix a build issue related to CONFIG_OF and COMPILE_TEST
     dependencies in the airoha cpufreq driver (Arnd Bergmann)

   - Fix a possible memory leak in the power capping subsystem (Joe
     Hattori)"

* tag 'pm-6.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq/amd-pstate: Fix cpufreq_policy ref counting
  cpufreq: prevent NULL dereference in cpufreq_online()
  cpufreq: airoha: modify CONFIG_OF dependency
  cpufreq/amd-pstate: Fix max_perf updation with schedutil
  cpufreq/amd-pstate: Remove the goto label in amd_pstate_update_limits
  cpufreq/amd-pstate: Fix per-policy boost flag incorrect when fail
  powercap: call put_device() on an error path in powercap_register_control_type()
Pull block fixes from Jens Axboe:

 - MD pull request via Song:
      - fix an error handling path for md-linear

 - NVMe pull request via Keith:
      - Connection fixes for fibre channel transport (Daniel)
      - Endian fixes (Keith, Christoph)
      - Cleanup fix for host memory buffer (Francis)
      - Platform specific power quirks (Georg)
      - Target memory leak (Sagi)
      - Use appropriate controller state accessor (Daniel)

 - Fixup for a regression introduced last week, where sunvdc wasn't
   updated for an API change, causing compilation failures on sparc64.

* tag 'block-6.14-20250207' of git://git.kernel.dk/linux:
  drivers/block/sunvdc.c: update the correct AIP call
  md: Fix linear_set_limits()
  nvme-fc: use ctrl state getter
  nvme: make nvme_tls_attrs_group static
  nvmet: add a missing endianess conversion in nvmet_execute_admin_connect
  nvmet: the result field in nvmet_alloc_ctrl_args is little endian
  nvmet: fix a memory leak in controller identify
  nvme-fc: do not ignore connectivity loss during connecting
  nvme: handle connectivity loss in nvme_set_queue_count
  nvme-fc: go straight to connecting state when initializing
  nvme-pci: Add TUXEDO IBP Gen9 to Samsung sleep quirk
  nvme-pci: Add TUXEDO InfinityFlex to Samsung sleep quirk
  nvme-pci: remove redundant dma frees in hmb
  nvmet: fix rw control endian access
…/scm/linux/kernel/git/konrad/ibft

Pull ibft fixes from Konrad Rzeszutek Wilk:
 "Two tiny fixes to IBFT code: one for Kconfig and another for IPv6"

* tag 'stable/for-linus-6.14-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/ibft:
  iscsi_ibft: Fix UBSAN shift-out-of-bounds warning in ibft_attr_show_nic()
  firmware: iscsi_ibft: fix ISCSI_IBFT Kconfig entry
…m/kernel

Pull drm fixes from Dave Airlie:
 "Just regular drm fixes, amdgpu, xe and i915 mostly, but a few
  scattered fixes. I think one of the i915 fixes fixes some build combos
  that Guenter was seeing.

  amdgpu:
   - Add new tiling flag for DCC write compress disable
   - Add BO metadata flag for DCC
   - Fix potential out of bounds access in display
   - Seamless boot fix
   - CONFIG_FRAME_WARN fix
   - PSR1 fix

  xe:
   - OA uAPI related fixes
   - Fix SRIOV migration initialization
   - Restore devcoredump to a sane state

  i915:
   - Fix the build error with clamp after WARN_ON on gcc 13.x+
   - HDCP related fixes
   - PMU fix zero delta busyness issue
   - Fix page cleanup on DMA remap failure
   - Drop 64bpp YUV formats from ICL+ SDR planes
   - GuC log related fix
   - DisplayPort related fixes

  ivpu:
   - Fix error handling

  komeda:
   - add return check

  zynqmp:
   - fix locking in DP code

  ast:
   - fix AST DP timeout

  cec:
   - fix broken CEC adapter check"

* tag 'drm-fixes-2025-02-08' of https://gitlab.freedesktop.org/drm/kernel: (29 commits)
  drm/i915/dp: Fix potential infinite loop in 128b/132b SST
  Revert "drm/amd/display: Use HW lock mgr for PSR1"
  drm/amd/display: Respect user's CONFIG_FRAME_WARN more for dml files
  accel/amdxdna: Add MODULE_FIRMWARE() declarations
  drm/i915/dp: Iterate DSC BPP from high to low on all platforms
  drm/xe: Fix and re-enable xe_print_blob_ascii85()
  drm/xe/devcoredump: Move exec queue snapshot to Contexts section
  drm/xe/oa: Set stream->pollin in xe_oa_buffer_check_unlocked
  drm/xe/pf: Fix migration initialization
  drm/xe/oa: Preserve oa_ctrl unused bits
  drm/amd/display: Fix seamless boot sequence
  drm/amd/display: Fix out-of-bound accesses
  drm/amdgpu: add a BO metadata flag to disable write compression for Vulkan
  drm/i915/backlight: Return immediately when scale() finds invalid parameters
  drm/i915/dp: Return min bpc supported by source instead of 0
  drm/i915/dp: fix the Adaptive sync Operation mode for SDP
  drm/i915/guc: Debug print LRC state entries only if the context is pinned
  drm/i915: Drop 64bpp YUV formats from ICL+ SDR planes
  drm/i915: Fix page cleanup on DMA remap failure
  drm/i915/pmu: Fix zero delta busyness issue
  ...
@pull pull bot added the ⤵️ pull label Feb 7, 2025
@pull pull bot merged commit 7ee983c into linux-doc:master Feb 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.