Those of you familiar with the SpringSense story will be aware that, until recently, SpringSense held the title of the world’s most accurate noun-sense disambiguator, and delivered this accuracy in real time.
In the July/August 2012 issue of IEEE Intelligent Systems, an academic paper from P. Chen, C. Bowes (Uni. of Houston) and W. Ding, M. Choly (Uni. of Massachusetts, Boston) described a technique that surpassed the accuracy of version 1.0 of our technology. Although that technique could not perform the task quickly enough to be useful to enterprise, we still didn’t like the idea of being number two, so we set about reclaiming the title.
We charged our chief scientist, Fred Rotbart, PhD, with doing whatever it took to regain the top position, and told him nothing was off-limits. In our early sessions investigating the possibilities, it quickly became clear that the basis of our existing approach was sound: it gave us real-time speed, something we couldn’t sacrifice if we wanted a solution our enterprise customers could use.
A way forward presented itself: our innovative approach to NLP was valid, but we needed to revisit our implementation. Dr Rotbart pulled our algorithm apart and put it back together, well oiled and with less cruft, which edged us closer to our goal. Our big breakthrough, though, came from a moment of insight that led Dr Rotbart to a more accurate way of using the results of our data-mining algorithm to perform the noun-sense disambiguation.
The result was an increase in accuracy to a world-leading 83.4%, as measured by the industry and academic benchmark SemEval 4 (task 7), without sacrificing any of the performance that allows SpringSense to be used for high-volume transactional workloads such as Big Data and enterprise search.
Being overtaken by a more accurate solution was a useful learning experience for us as a team. It taught us that, to be useful to our customers, a solution needs to work in real time without sacrificing accuracy. Our mission here in the SpringSense team remains to lead the world in both speed and accuracy.
The new version with the accuracy improvements is already live on the Mashape API Hub. We’d love for you to try it out and give us your feedback; a free plan is offered for your evaluation. Bindings are available for Ruby, Python, Java, ElasticSearch and more.