Patent Application Titled “Health Insurance Cost Prediction Reporting Via Private Transfer Learning” Published Online (USPTO 20190333155)
2019 NOV 18 (NewsRx) -- By a
The assignee for this patent application is
Reporters obtained the following quote from the background information supplied by the inventors: “The present invention relates generally to the field of computing, and more particularly to the transfer of health insurance cost prediction reporting without violating Health Insurance Portability and Accountability Act (HIPAA) compliance or data ownership or any other data policies of the stakeholders.
“Data policies and regulations may include anonymized data, or the data may be unable to move from the original location. Health insurance cost data may be noisy and may require advanced analytics like machine learning techniques to make future cost predictions with a reasonable amount of accuracy. Features that contribute to the cost of health insurance utilization may exist in a very large feature space with a large quantity of samples to perform pattern analysis and prediction. Health insurance cost historical data may often be limited to a small number of people in a provider’s plan area compared to what may be necessary to perform accurate cost prediction due to, among other factors, company size and coverage area, retention and turnover of customers from job and locality changes.”
In addition to obtaining background information on this patent application, NewsRx editors also obtained the inventors’ summary information for this patent application: “Embodiments of the present invention disclose a method, computer system, and a computer program product for generating and reporting a plurality of health insurance cost predictions via private transfer learning. The present invention may include retrieving a set of source data from at least one private source database, and a set of target data from a private target database. The present invention may then include creating a plurality of source data sets from the retrieved set of source data, and at least one target data set from the retrieved set of target data. The present invention may also include anonymizing the created plurality of source data sets, and at least one created target data set. The present invention may further include, in response to determining that at least one anonymized source training data set and at least one anonymized target training data set is created, generating one or more source learner models based on the anonymized source training data set, and a target learner model based on the anonymized target training data set. The present invention may then include combining the one or more generated source learner models and the generated target learner model to generate a transfer learner. The present invention may further include generating a prediction based on the generated transfer learner, wherein the generated prediction is evaluated for quality.”
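The combining step in the summary can be illustrated with a small sketch. It is not the inventors' implementation: the one-feature linear learners, the single blending weight, the held-out target records, and every name and number below are assumptions made for illustration only. A source learner is fit on a larger data set, a target learner on a small one, and a weight is learned so that the blended "transfer learner" fits held-out target data as well as possible.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    return lambda x: slope * x + intercept

def learn_blend_weight(source_model, target_model, xs, ys):
    """Weight w in [0, 1] minimizing squared error of w*target + (1-w)*source."""
    num = den = 0.0
    for x, y in zip(xs, ys):
        s, t = source_model(x), target_model(x)
        num += (t - s) * (y - s)
        den += (t - s) ** 2
    if den == 0:
        return 0.5  # both models agree everywhere; any weight works
    return min(1.0, max(0.0, num / den))

def make_transfer_learner(source_model, target_model, w):
    """Blend the two learners with the learned weight."""
    return lambda x: w * target_model(x) + (1 - w) * source_model(x)

# Hypothetical anonymized (age, annual cost) records; not from the application.
source_age = [20, 30, 40, 50, 60]
source_cost = [1000, 1500, 2000, 2500, 3000]
target_age = [25, 55]
target_cost = [1400, 3100]
holdout_age = [35, 45]
holdout_cost = [1900, 2450]

source_learner = fit_line(source_age, source_cost)
target_learner = fit_line(target_age, target_cost)
w = learn_blend_weight(source_learner, target_learner, holdout_age, holdout_cost)
transfer_learner = make_transfer_learner(source_learner, target_learner, w)

print(f"blend weight on target learner: {w:.2f}")
print(f"predicted cost at age 42: {transfer_learner(42):.0f}")
```

Because the weight is chosen to minimize error on the held-out target records, the blended learner does no worse on those records than either learner alone. The application describes learning "a set of weights and a set of methods"; a single scalar weight is the simplest possible stand-in for that.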
The claims supplied by the inventors are:
“1. A method for generating and reporting a plurality of health insurance cost predictions via private transfer learning, the method comprising: retrieving a set of source data from at least one private source database, and a set of target data from a private target database; creating a plurality of source data sets from the retrieved set of source data, and at least one target data set from the retrieved set of target data; anonymizing the created plurality of source data sets, and at least one created target data set; in response to determining that at least one anonymized source training data set, and at least one anonymized target training data set is created, generating one or more source learner models based on the anonymized source training data set, and a target learner model based on the anonymized target training data set; combining the one or more generated source learner models and the generated target learner model to generate a transfer learner; and generating a prediction based on the generated transfer learner, wherein the generated prediction is evaluated for quality.
“2. The method of claim 1, further comprising: generating a report based on the generated prediction for the end user.
“3. The method of claim 1, further comprising: in response to receiving a database location and a plurality of access credentials from the end user, providing a model prediction and a model performance to the end user.
“4. The method of claim 1, further comprising: determining that at least one anonymized target test data set, and at least one anonymized source test data set is created; generating a prediction based on the generated transfer learner based on the at least one determined source test data set and at least one determined target test data set, wherein the generated prediction is evaluated for quality; and generating a report based on the evaluated prediction to the end user.
“5. The method of claim 1, wherein combining the generated one or more source learner models and the generated target learner model to generate the transfer learner, further comprises: aligning the one or more combined source learner models and the combined target learner model based on the features used in each learner model; filtering data with the features absent on the one or more aligned source learner models and the aligned target learner model; learning a set of weights and a set of methods to combine the aligned source learner model and the aligned target learner model based on the filtered data; and generating the transfer learner based on the learned set of weights and learned set of models.
“6. The method of claim 1, wherein creating the plurality of source data sets from the retrieved set of source data, and at least one target data set from the retrieved set of target data, further comprises: cleaning the created plurality of source data sets and at least one created target data set by utilizing a data preparation pipeline; and formatting the cleaned plurality of source data sets and at least one cleaned target data set for predictive modelling.
“7. The method of claim 5, wherein learning the set of weights and the set of methods to combine the one or more aligned source learner models and the aligned target learner model based on the filtered data, further comprises: generating a plurality of source features associated with the one or more aligned source learner models, and a plurality of target features associated with the aligned target learner model; in response to mapping the generated plurality of source features to resemble the generated plurality of target features, generating feature mapping; in response to determining that the anonymized source data set includes a plurality of different characteristics absent from a target population, generating at least one set of summary statistics by utilizing a summary statistics module; generating at least one population shifted data set from at least one set of summary statistics to re-weight the anonymized source data; and generating an updated source learner model based on the generated featuring mapping and at least one generated population shift data set.
“8. The method of claim 7, further comprising: identifying at least one model feature intersection by examining the generated updated source learner model and generated target learner model; generating a plurality of samples from the at least one identified model feature intersection; in response to determining that one or more of the generated plurality of samples include a piece of dropped data, removing the one or more generated plurality of samples including the piece of dropped data; and receiving a plurality of target predictions from the target learner model based on the removed dropped data.
“9. The method of claim 8, further comprising: in response to determining that one or more of the generated plurality of samples include a piece of remaining data, receiving the piece of remaining data into the generated transfer learner; and generating a plurality of transfer predictions from the transfer learner based on the received remaining data.
“10. The method of claim 9, further comprising: combining the generated plurality of target predictions and the generated plurality of transfer predictions; generating a predictive model based on the combined plurality of target predictions and the generated plurality of transfer predictions; and generating a performance evaluation based on the generated predictive model.
“11. A computer system for generating and reporting a plurality of health insurance cost predictions via private transfer learning, comprising: one or more processors, one or more computer-readable memories, one or more computer-readable tangible storage medium, and program instructions stored on at least one of the one or more tangible storage medium for execution by at least one of the one or more processors via at least one of the one or more memories, wherein the computer system is capable of performing a method comprising: retrieving a set of source data from at least one private source database, and a set of target data from a private target database; creating a plurality of source data sets from the retrieved set of source data, and at least one target data set from the retrieved set of target data; anonymizing the created plurality of source data sets, and at least one created target data set; in response to determining that at least one anonymized source training data set, and at least one anonymized target training data set is created, generating one or more source learner models based on the anonymized source training data set, and a target learner model based on the anonymized target training data set; combining the one or more generated source learner models and the generated target learner model to generate a transfer learner; and generating a prediction based on the generated transfer learner, wherein the generated prediction is evaluated for quality.
“12. The computer system of claim 11, further comprising: generating a report based on the generated prediction for the end user.
“13. The computer system of claim 11, further comprising: in response to receiving a database location and a plurality of access credentials from the end user, providing a model prediction and a model performance to the end user.
“14. The computer system of claim 11, wherein combining the generated one or more source learner models and the generated target learner model to generate the transfer learner, further comprises: aligning the one or more combined source learner models and the combined target learner model based on the features used in each learner model; filtering data with the features absent on the one or more aligned source learner models and the aligned target learner model; learning a set of weights and a set of methods to combine the aligned source learner model and the aligned target learner model based on the filtered data; and generating the transfer learner based on the learned set of weights and learned set of models.
“15. The computer system of claim 11, wherein creating the plurality of source data sets from the retrieved set of source data, and at least one target data set from the retrieved set of target data, further comprises: cleaning the created plurality of source data sets and at least one created target data set by utilizing a data preparation pipeline; and formatting the cleaned plurality of source data sets and at least one cleaned target data set for predictive modelling.
“16. The computer system of claim 14, wherein learning the set of weights and the set of methods to combine the one or more aligned source learner models and the aligned target learner model based on the filtered data, further comprises: generating a plurality of source features associated with the one or more aligned source learner models, and a plurality of target features associated with the aligned target learner model; in response to mapping the generated plurality of source features to resemble the generated plurality of target features, generating feature mapping; in response to determining that the anonymized source data set includes a plurality of different characteristics absent from a target population, generating at least one set of summary statistics by utilizing a summary statistics module; generating at least one population shifted data set from at least one set of summary statistics to re-weight the anonymized source data; and generating an updated source learner model based on the generated featuring mapping and at least one generated population shift data set.
“17. The computer system of claim 16, further comprising: identifying at least one model feature intersection by examining the generated updated source learner model and generated target learner model; generating a plurality of samples from the at least one identified model feature intersection; in response to determining that one or more of the generated plurality of samples include a piece of dropped data, removing the one or more generated plurality of samples including the piece of dropped data; and receiving a plurality of target predictions from the target learner model based on the removed dropped data.
“18. The computer system of claim 17, further comprising: in response to determining that one or more of the generated plurality of samples include a piece of remaining data, receiving the piece of remaining data into the generated transfer learner; and generating a plurality of transfer predictions from the transfer learner based on the received remaining data.
“19. The computer system of claim 18, further comprising: combining the generated plurality of target predictions and the generated plurality of transfer predictions; generating a predictive model based on the combined plurality of target predictions and the generated plurality of transfer predictions; and generating a performance evaluation based on the generated predictive model.
“20. A computer program product for generating and reporting a plurality of health insurance cost predictions via private transfer learning, comprising: one or more computer-readable storage media and program instructions stored on at least one of the one or more tangible storage media, the program instructions executable by a processor to cause the processor to perform a method comprising: retrieving a set of source data from at least one private source database, and a set of target data from a private target database; creating a plurality of source data sets from the retrieved set of source data, and at least one target data set from the retrieved set of target data; anonymizing the created plurality of source data sets, and at least one created target data set; in response to determining that at least one anonymized source training data set, and at least one anonymized target training data set is created, generating one or more source learner models based on the anonymized source training data set, and a target learner model based on the anonymized target training data set; combining the one or more generated source learner models and the generated target learner model to generate a transfer learner; and generating a prediction based on the generated transfer learner, wherein the generated prediction is evaluated for quality.”
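The population-shift step recited in claims 7 and 16 can be pictured as follows. This is one possible reading, not the claimed method itself: it assumes the target side shares only summary statistics (here, the share of members in each age band) and that source records are re-weighted by the ratio of target share to source share. The age bands, shares, costs, and function name are all hypothetical.

```python
from collections import Counter

def population_shift_weights(source_groups, target_shares):
    """Per-record weights so the weighted source mix matches the target shares.

    source_groups: group label for each anonymized source record.
    target_shares: summary statistics from the target population,
                   mapping group label -> fraction of the population.
    """
    counts = Counter(source_groups)
    n = len(source_groups)
    return [target_shares.get(g, 0.0) / (counts[g] / n) for g in source_groups]

# Hypothetical anonymized source records: age band and annual cost.
source_groups = ["18-39"] * 6 + ["40-64"] * 4
source_costs = [1000.0] * 6 + [3000.0] * 4

# Hypothetical summary statistics shared by the target plan (no raw rows).
target_shares = {"18-39": 0.3, "40-64": 0.7}

weights = population_shift_weights(source_groups, target_shares)

# Cost averages before and after re-weighting toward the target population.
raw_mean = sum(source_costs) / len(source_costs)
shifted_mean = sum(w * c for w, c in zip(weights, source_costs)) / sum(weights)

# Share of the older band after re-weighting.
older_share = sum(
    w for w, g in zip(weights, source_groups) if g == "40-64"
) / sum(weights)

print(f"raw mean cost: {raw_mean:.0f}, shifted mean cost: {shifted_mean:.0f}")
```

Only aggregate shares cross from the target side in this sketch, which is in keeping with the application's stated aim of leaving private data at its original location. A source learner retrained with these weights would emphasize the records that most resemble the target population.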
For more information, see this patent application:
(Our reports deliver fact-based news of research and discoveries from around the world.)