Insurance companies enter, maintain and manage tens of thousands of claims annually. The study examined computer-based approaches for efficiently assigning each claim to the one- and two-digit "event code" categories developed by the U.S. Bureau of Labor Statistics.
"So now we are trying to take these vast sets of data, which have been limited in their utility due to the large expense in hiring manual coders, and we are able to glean important information from the injury narratives and come up with new knowledge on the potential causes and prevention of injuries," Lehto said.
The new models might lead to programs that automatically code reports as they are being filed.
"These models can be easily updated to deal with new types of accidents they haven't encountered before," Lehto said.
The models calculated the probability that reports would be classified by human coders in specific categories. One model, called "naive," reviewed individual words, and the other, called "fuzzy," looked at sequences of words and phrases in the narratives, such as "fell off a ladder."
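As a rough illustration of the word-by-word idea, the sketch below scores a narrative against each category using a textbook naive Bayes calculation with add-one smoothing. It is not the researchers' code: the smoothing choice, the function names, the sample narratives, and the category labels "FALL" and "STRUCK" are all assumptions made for this example, and real Bureau of Labor Statistics event codes would replace the labels.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(narratives, codes):
    """Count how often each word appears in narratives of each category."""
    word_counts = defaultdict(Counter)  # category -> word -> count
    code_counts = Counter(codes)        # category -> number of narratives
    vocab = set()
    for text, code in zip(narratives, codes):
        words = text.lower().split()
        word_counts[code].update(words)
        vocab.update(words)
    return word_counts, code_counts, vocab


def predict_proba(text, model):
    """Return the estimated probability of each category for one narrative."""
    word_counts, code_counts, vocab = model
    total_docs = sum(code_counts.values())
    log_scores = {}
    for code, n_docs in code_counts.items():
        total_words = sum(word_counts[code].values())
        score = math.log(n_docs / total_docs)  # prior for the category
        for word in text.lower().split():
            # Add-one smoothing (an assumption) keeps unseen words from
            # driving the probability to zero.
            count = word_counts[code][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        log_scores[code] = score
    # Convert log scores to probabilities that sum to 1.
    top = max(log_scores.values())
    weights = {c: math.exp(s - top) for c, s in log_scores.items()}
    norm = sum(weights.values())
    return {c: w / norm for c, w in weights.items()}


# Hypothetical training narratives and labels, for illustration only.
narratives = [
    "fell off a ladder",
    "slipped and fell on wet floor",
    "struck by falling box",
    "hit by forklift",
]
codes = ["FALL", "FALL", "STRUCK", "STRUCK"]

model = train_naive_bayes(narratives, codes)
probs = predict_proba("fell from ladder", model)
best = max(probs, key=probs.get)
print(best, probs)
```

A model that looks at sequences of words would use phrases such as "fell off a ladder" as features instead of single words, but the probability bookkeeping would follow the same pattern.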
The researchers used a database of 14,000 claim cases, with 11,000 used to develop the models and 3,000 held back to test them.
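The split between development and test cases can be sketched as follows. This is a minimal example under stated assumptions: the article does not say how the 3,000 test cases were chosen, so the random shuffle, the seed, and the helper names `split_cases` and `agreement_rate` are hypothetical, and the integer case IDs stand in for real claim records.

```python
import random


def split_cases(cases, n_test, seed=0):
    """Shuffle the cases and hold back n_test of them for testing."""
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)  # random split is an assumption
    return shuffled[n_test:], shuffled[:n_test]


def agreement_rate(predicted, human):
    """Fraction of cases where the model's code matches the human coder's."""
    matches = sum(p == h for p, h in zip(predicted, human))
    return matches / len(human)


# Stand-in case IDs; a real run would use the claim narratives and codes.
cases = list(range(14000))
develop, test = split_cases(cases, n_test=3000)
print(len(develop), len(test))
```

Keeping the test cases out of model development is what allows the agreement rate to reflect performance on narratives the models have not seen.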
"It's important to distinguish that we predicted 3,000 cases that were different than the ones used to develop the models," Lehto said. "These were cases the models hadn't seen before, and the models accurately predicted how these cases would be classified by human coders."
Contact: Emil Venere