Never-Ending Learning to Read the Web

Tuesday, August 27, 2013, at 1 pm ET/12 noon CT/11 am MT/10 am PT/5 pm GMT

One of the great technical challenges in big data is to construct computer systems that learn continuously over years, from a continuing stream of diverse data, improving their competence at a variety of tasks, and becoming better learners over time.

This webinar describes Carnegie Mellon University's research to build a Never-Ending Language Learner (NELL) that runs 24 hours per day, forever, learning to read the web. Each day NELL extracts (reads) more facts from the web, and integrates these into its growing knowledge base of beliefs. Each day NELL also learns to read better than yesterday, enabling it to go back to the text it read yesterday, and extract more facts, more accurately, today.
 
NELL has been running 24 hours a day for over three years. The result so far is a collection of 50 million interconnected beliefs (e.g., servedWith(coffee, applePie), isA(applePie, bakedGood)) that NELL holds at different levels of confidence, along with hundreds of thousands of learned phrasings, morphological features, and web page structures that it uses to extract beliefs from the web. Track NELL's progress at http://rtw.ml.cmu.edu.
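As a rough illustration of what a belief held "at different levels of confidence" might look like in code, here is a minimal Python sketch. It is not NELL's actual representation: the Belief class, the parse_belief and confident helpers, and the confidence values are all hypothetical, chosen only to mirror the relation(argument, argument) notation used in the examples above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Belief:
    """One relation(subject, object) belief with a confidence score (hypothetical structure)."""
    relation: str
    subject: str
    obj: str
    confidence: float


def parse_belief(text: str, confidence: float) -> Belief:
    """Parse a string such as 'isA(applePie, bakedGood)' into a Belief."""
    relation, _, rest = text.partition("(")
    subject, obj = (part.strip() for part in rest.rstrip(")").split(","))
    return Belief(relation.strip(), subject, obj, confidence)


def confident(beliefs, threshold):
    """Return only the beliefs whose confidence is at or above the threshold."""
    return [b for b in beliefs if b.confidence >= threshold]


# The confidence values below are invented for illustration only.
beliefs = [
    parse_belief("servedWith(coffee, applePie)", 0.9),
    parse_belief("isA(applePie, bakedGood)", 0.6),
]

for belief in confident(beliefs, 0.8):
    print(belief.relation, belief.subject, belief.obj)
```

Keeping the confidence on each belief, rather than storing only accepted facts, lets a system of this kind revisit low-confidence candidates later as its extractors improve.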
 
Presenter: Tom M. Mitchell, Carnegie Mellon University
Tom M. Mitchell founded and chairs the Machine Learning Department at Carnegie Mellon University, where he is the E. Fredkin University Professor. His research uses machine learning to develop computers that learn to read the web, and uses brain imaging to study how the human brain understands what it reads. Mitchell is a member of the U.S. National Academy of Engineering, a Fellow of the American Association for the Advancement of Science (AAAS), and a Fellow of the Association for the Advancement of Artificial Intelligence (AAAI). He believes machine learning will be the fastest-growing branch of computer science during the 21st century. Mitchell's web page is http://www.cs.cmu.edu/~tom.
 
Moderator: Yolanda Gil, University of Southern California; SIGART
Yolanda Gil is Director of Knowledge Technologies and Associate Division Director at the Information Sciences Institute of the University of Southern California, and Research Professor in the Computer Science Department. She received her M.S. and Ph.D. degrees in computer science from Carnegie Mellon University. Dr. Gil leads a group that conducts research on various aspects of Interactive Knowledge Capture. Her research interests include intelligent user interfaces, knowledge-rich problem solving, and the semantic web. An area of recent interest is collaborative large-scale data analysis through semantic workflows. She recently led the W3C Provenance Group that charted a community standardization effort in this area. Dr. Gil has served on the Advisory Committee of the Computer Science and Engineering Directorate of the National Science Foundation. She is Chair of ACM SIGART, the Association for Computing Machinery's Special Interest Group on Artificial Intelligence. She was elected a Fellow of the Association for the Advancement of Artificial Intelligence (AAAI) in 2012.
 
