Data mining to determine customer satisfaction over time
Proposer/liaison
Eleanor M. Feit, GM North American Product Development

Each year, General Motors surveys thousands of new
vehicle owners, asking them to rate their satisfaction
with their new vehicles. In addition to rating their
overall satisfaction, survey respondents rate specific
aspects of their vehicle, such as
“Braking performance under normal conditions”
or “Quietness inside the vehicle.” Once
we receive the survey responses, we typically average
the survey ratings for each vehicle to determine whether
GM’s customers are more or less satisfied than customers
of competitive vehicles. The data collected from these surveys is
archived and is readily available going back to 1997.

We would like to mine this data to identify trends in customer
satisfaction over time. For example, we would like to be able to
answer the question “How satisfied do we expect
customers of the Lexus RX300 to be two years from now?
Based on past trends, will customers be more satisfied
than they are now?” If we could answer this
question, we would be in a much better position to
know how well GM’s luxury sport utility vehicle
will need to perform to compete with the RX300.
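
One simple way to approach the forecasting question is to fit a straight line to a vehicle's average rating by year and extrapolate it two years ahead. The sketch below is only an illustration under that assumption: the function names and the yearly averages are made up, and a real model would likely need to account for the product and demographic effects described later in this summary. The 0-4 bounds match the rating scale quoted in the PT Cruiser example.

```python
# Illustrative sketch only: function names and data are hypothetical.

def fit_linear_trend(years, ratings):
    """Ordinary least-squares line through (year, average rating) points."""
    n = len(years)
    if n < 2:
        raise ValueError("need at least two years of data")
    mean_x = sum(years) / n
    mean_y = sum(ratings) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    if sxx == 0:
        raise ValueError("years must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, ratings))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast_rating(years, ratings, target_year, scale=(0.0, 4.0)):
    """Extrapolate the fitted line to target_year, clipped to the rating scale."""
    slope, intercept = fit_linear_trend(years, ratings)
    low, high = scale
    return min(high, max(low, slope * target_year + intercept))


# Made-up yearly averages for one masked vehicle (not real survey data).
years = [1997, 1998, 1999, 2000, 2001, 2002]
ratings = [2.80, 2.85, 2.90, 2.95, 3.00, 3.05]
print(forecast_rating(years, ratings, 2004))
```

A straight line is the simplest possible choice here. It says nothing about how uncertain the forecast is, which the team would probably want to quantify before comparing vehicles.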

We would also like to identify significant “step function”
improvements in satisfaction that occurred on competitive
vehicles in the past. For instance, if we were able to identify
that satisfaction with braking on the Civic significantly
improved between model years 1999 and 2000, then we
could ask engineers to review design changes that
occurred in those years and try to identify what about
the design made it more satisfying to the customer.
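
A possible starting point for finding such step changes is to compare the individual ratings from each pair of consecutive model years and flag the pairs whose averages differ by more than sampling noise would explain. The sketch below uses Welch's t statistic for that comparison. This is an assumption for illustration, not a method GM prescribes, and the threshold of 3.0, the function names, and the ratings are all hypothetical.

```python
# Illustrative sketch only: threshold, names, and data are hypothetical.
from math import copysign, sqrt
from statistics import mean, variance


def welch_t(before, after):
    """Welch's t statistic for the change in mean rating (after minus before).

    Each sample needs at least two ratings so a variance can be computed.
    """
    se = sqrt(variance(before) / len(before) + variance(after) / len(after))
    diff = mean(after) - mean(before)
    if se == 0:
        return 0.0 if diff == 0 else copysign(float("inf"), diff)
    return diff / se


def find_step_changes(ratings_by_year, threshold=3.0):
    """Return (previous_year, year, t) for consecutive years with |t| >= threshold."""
    years = sorted(ratings_by_year)
    steps = []
    for prev, curr in zip(years, years[1:]):
        t = welch_t(ratings_by_year[prev], ratings_by_year[curr])
        if abs(t) >= threshold:
            steps.append((prev, curr, t))
    return steps


# Made-up ratings for one masked satisfaction area on one vehicle.
ratings_by_year = {
    1998: [2, 3, 2, 3],
    1999: [2, 3, 3, 2],
    2000: [4, 4, 3, 4],
}
for prev, curr, t in find_step_changes(ratings_by_year):
    print(f"possible step change between {prev} and {curr}: t = {t:.2f}")
```

Running this over every vehicle and satisfaction area involves many comparisons, so some flagged changes will be chance findings. The threshold would need to be tuned with that in mind.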

While both of these problems may sound simple at the outset,
data analysis of the automotive market is always problematic
due to the tremendous complexity of our products.
While there are just over 300 nameplates (e.g. Pontiac
Vibe) in the US market, there are numerous variations
of each nameplate, with different engines, transmissions,
and other features that could significantly impact
customer satisfaction. Sometimes the differences between
two variants of the same nameplate are more significant
than the differences between two competitive nameplates.

This analysis is also potentially impacted by demographic skews
that may exist in the survey data. For example, we
know that men and women who purchase the same vehicle
may have a tendency to rate the vehicle differently.
Women who purchased the PT Cruiser in the second quarter
of 2002 rated it 3.03 on average on a scale of 0-4,
whereas men rated it 2.71. Given that men and women
who purchase the same car rate it differently, maybe
we should adjust the average ratings that we get for
vehicles that are purchased predominantly by men or
by women. There may be a need for other adjustments
based on age or other demographics.
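
One way such an adjustment could work is direct standardization: compute the average rating within each demographic group, then combine the group averages using a fixed reference mix instead of each vehicle's own buyer mix. The sketch below applies this to the PT Cruiser figures above. The respondent counts and the 50/50 reference mix are hypothetical, since the actual split is not given here, and the function names are illustrative.

```python
# Illustrative sketch only: counts and reference mix are hypothetical.

def raw_average(group_means, group_counts):
    """Average rating weighted by how many respondents fall in each group."""
    total = sum(group_counts.values())
    return sum(group_means[g] * group_counts[g] for g in group_counts) / total


def standardized_average(group_means, reference_mix):
    """Average rating reweighted to a common reference mix of groups."""
    total = sum(reference_mix.values())
    return sum(group_means[g] * reference_mix[g] for g in reference_mix) / total


# Group means quoted in the text (PT Cruiser, second quarter of 2002).
pt_cruiser_means = {"women": 3.03, "men": 2.71}

# Hypothetical respondent counts; the real split is not stated.
assumed_counts = {"women": 600, "men": 400}

# Hypothetical reference mix applied to every vehicle being compared.
reference_mix = {"women": 0.5, "men": 0.5}

print(raw_average(pt_cruiser_means, assumed_counts))
print(standardized_average(pt_cruiser_means, reference_mix))
```

With these assumed counts the raw average is about 2.90, while the standardized average is about 2.87. The gap would be larger for vehicles whose buyers are more heavily skewed toward one group. The same reweighting extends to age bands or other demographics by adding more groups.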

Because the survey data is proprietary, we will supply the
team with a masked version. For instance, we may not
reveal the names of the specific satisfaction areas.
We will also make some hard performance data available
to the team, such as acceleration times.

The ideal project deliverable would be a forecasting
model used to predict trends in customer satisfaction
and a method for sorting through a large number of vehicles
to identify significant changes in customer satisfaction
over time. GM uses Excel, Access and Minitab as its
primary data analysis software, so any delivered analysis
tools should be compatible with that software.

(This summary was prepared by Eleanor M. Feit.)