Jump to content

How to do classification with high predictive value?


dchalmer

Recommended Posts

I have a two-class classification problem. I would like to train a multivariate classifier with 100% positive predictive value. In other words, I want the model to completely avoid one of the classes. For this application a low-ish sensitivity is OK as long as PPV is ~100%.
Do you have any suggestions of good techniques to use?

Thank you!

Link to comment
Share on other sites

  • 2 weeks later...

I have a two-class classification problem. I would like to train a multivariate classifier with 100% positive predictive value. In other words, I want the model to completely avoid one of the classes. For this application a low-ish sensitivity is OK as long as PPV is ~100%.

Do you have any suggestions of good techniques to use?

Thank you!

 

 

I know an article that did precisely what you ask. 

 

Dr. Harvey wanted to predict deep vein thrombosis (DVT) using d-dimer (DD) as an attribute, but he sought 100% correct prediction of all positive DVD cases.  So he ordered observations by their d-d value, and located the d-d value beneath which there were no observations positive for DVD.  He then assessed the statistical significance and effect strength of the resulting model using an exact non-parametric methodology known as “optimal data analysis” (ODA).  The ODA statistical paradigm explicitly maximizes model accuracy (rather than variance or value of the likelihood function).  Dr. Harvey’s article was selected by the American College of Physicians Journal Club, and read by most practicing internists in the US.

 

Here is the link to the article citation: http://stroke.ahajournals.org/content/27/9/1516.long

 

Here is the citation to the seminal introduction to the ODA paradigm, which comes with software for Windows: http://www.amazon.com/Optimal-Data-Analysis-Guidebook-Software/dp/1557989818

 

And here is a blog about ODA: http://odajournal.wordpress.com/

 

BTW, regardless of the analytic geometry (the structure of the data), ODA always identifies the model that explicitly maximizes model classification accuracy

Edited by Planet
Link to comment
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
×
×
  • Create New...

Important Information

This website uses cookies to ensure you get the best experience on our website. See our Privacy Policy and Terms of Use