From: Michael Brunner
Date: Wed, 26 Aug 2009 21:29:25 +0000 (-0700)
Subject: thermal_sys: check get_temp return value
X-Git-Tag: v2.6.31-rc8~19
X-Git-Url: https://openfabrics.org/gitweb/?a=commitdiff_plain;h=0d288162f2afc42b37aab656f4622c076babbca3;p=~emulex%2Finfiniband.git

thermal_sys: check get_temp return value

The return value of the get_temp function is not checked when doing a
thermal zone update.  This may lead to a critical shutdown if get_temp
fails and the content of the temp variable is incorrectly set higher
than the critical trip point.

This has been observed on a system with incorrect ACPI implementation
where the corresponding methods were not serialized and therefore
sometimes triggered ACPI errors (AE_ALREADY_EXISTS).  The following
critical shutdowns indicated a temperature of 2097 C, which was
obviously wrong.

The patch adds a return value check that jumps over all trip point
evaluations printing a warning if get_temp fails.  The trip points are
evaluated again on the next polling interval with successful get_temp
execution.

Signed-off-by: Michael Brunner
Acked-by: Zhang Rui
Cc: Len Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
---

diff --git a/drivers/thermal/thermal_sys.c b/drivers/thermal/thermal_sys.c
index 0a69672097a..4e83c297ec9 100644
--- a/drivers/thermal/thermal_sys.c
+++ b/drivers/thermal/thermal_sys.c
@@ -953,7 +953,12 @@ void thermal_zone_device_update(struct thermal_zone_device *tz)
 
 	mutex_lock(&tz->lock);
 
-	tz->ops->get_temp(tz, &temp);
+	if (tz->ops->get_temp(tz, &temp)) {
+		/* get_temp failed - retry it later */
+		printk(KERN_WARNING PREFIX "failed to read out thermal zone "
+		       "%d\n", tz->id);
+		goto leave;
+	}
 
 	for (count = 0; count < tz->trips; count++) {
 		tz->ops->get_trip_type(tz, count, &trip_type);
@@ -1005,6 +1010,8 @@ void thermal_zone_device_update(struct thermal_zone_device *tz)
 					  THERMAL_TRIPS_NONE);
 
 	tz->last_temperature = temp;
+
+      leave:
 	if (tz->passive)
 		thermal_zone_device_set_polling(tz, tz->passive_delay);
 	else if (tz->polling_delay)
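
The check-and-skip pattern described in the commit message can be tried outside the kernel with a small userspace sketch. Everything below is hypothetical scaffolding and not the thermal subsystem API: `struct fake_zone`, `fake_get_temp` and `fake_zone_update` are made-up names, there is only one critical trip point, and the failing read deliberately writes a bogus 2097 into the output variable to mimic the symptom from the report (the real failure mode may simply have left the variable in an undefined state).

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a thermal zone; not the kernel structure. */
struct fake_zone {
	int id;
	long critical_temp;      /* single critical trip point, arbitrary units */
	long last_temperature;   /* last value accepted by a successful update */
	int read_status;         /* what the simulated read returns, 0 = success */
	long sensor_value;       /* value reported by a successful read */
	bool shutdown_requested; /* set when the critical trip point is reached */
};

/*
 * Simulated get_temp. On failure it stores a bogus value in *temp
 * (an assumption made for this demo) and returns a non-zero status.
 */
static int fake_get_temp(struct fake_zone *tz, long *temp)
{
	if (tz->read_status) {
		*temp = 2097;
		return tz->read_status;
	}
	*temp = tz->sensor_value;
	return 0;
}

/*
 * Mirrors the idea of the patch: if the read fails, warn and skip the
 * trip point evaluation, leaving it to the next polling interval.
 * Returns 0 when the trip point was evaluated, else the read error.
 */
static int fake_zone_update(struct fake_zone *tz)
{
	long temp;
	int ret = fake_get_temp(tz, &temp);

	if (ret) {
		fprintf(stderr, "failed to read out thermal zone %d\n", tz->id);
		return ret;
	}

	if (temp >= tz->critical_temp)
		tz->shutdown_requested = true;

	tz->last_temperature = temp;
	return 0;
}
```

With a zone whose `critical_temp` is 100, an update whose read fails leaves `shutdown_requested` and `last_temperature` untouched despite the bogus 2097, while a later successful read of 120 requests the shutdown as expected.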