Faculty Spotlight: Professor Bei Yu

Interviewee: Professor Bei Yu

Tell us about yourself:

My name is Bei Yu. I am a Katchmar-Willhelm endowed associate professor at the iSchool. I joined the iSchool in 2009. I will be teaching Data Mining for the Information Management online program.

What is your academic background?

I am a computer and information scientist by trade. I earned my bachelor’s and master’s degrees in computer science, and then my PhD in library and information science from the University of Illinois at Urbana-Champaign (UIUC). I also completed three years of postdoctoral training at the Kellogg School of Management at Northwestern University.

Why did you pursue your PhD in Library and Information Science from the University of Illinois at Urbana-Champaign?

After finishing my bachelor’s and master’s degrees in computer science, I felt I had a lot of interaction with computers but I did not know much about my fellow human beings in terms of how they interact with the computer systems that I developed. That was also in 1999 when the internet and search engines started to transform our communication and information access. I became interested in human-computer interaction and information retrieval. These are interdisciplinary areas intersecting CS and LIS, so I applied to PhD programs in both CS and LIS. I chose the UIUC iSchool for my PhD study, where I fell in love with NLP and machine learning, and began to work on how to use these technologies to help people find, organize and assess information.

What skills do you think you have gained that have successfully helped you in your professional life?

Math, programming, critical thinking, and communication skills are super helpful for my career. I need math and programming to design and implement algorithms, and to understand algorithms designed by others. Critical thinking is crucial for evaluating prediction models, such as what patterns they have learned and whether those patterns are what we expected. Communication skills are vital for talking to people with or without a technical background. As a data scientist, no matter how exciting the patterns you have found in the data, you have to tell a data story that makes sense to both experts and laypersons.

Based on your professional background, how have you seen information and data management in an organization change over the past years?

Over the past years, organizations have been paying more attention to big data and predictive analysis in order to increase the value of data for their businesses. With cloud computing and the Internet of Things, we will see all kinds of companies accumulate large amounts of data and look for talent to analyze it.

How does your industry experience impact your teaching?

So far I haven’t worked in industry. I’m mostly academic and sometimes do consulting for companies. I’ve collaborated with researchers and practitioners in many areas, like humanist scholars, hospitality researchers, political scientists, linguists, and management science scholars. I have accumulated a lot of stories about real-world problems in these areas and how data mining can help. Students have told me that these stories helped them understand how to apply the data mining techniques that they learned in class.

What issues related to information and/or data interest you most?

I’m interested in all issues that are relevant to building reliable and useful prediction models. These issues include, but are not limited to, data quality, domain knowledge hidden in the data, the strengths and weaknesses of different algorithms based on computational learning theory, reliable evaluation methods, and reproducible research.

What sets the iSchool@Syracuse apart from other schools?

We emphasize experiential learning and care about how students can apply what they have learned to real-world problems. I think this is crucial for professional training in information management. Our staff team is also top-notch in supporting faculty teaching and student learning.

What are you most looking forward to with this new cohort?

I look forward to curiosity about data, fun discussions of data from all sorts of areas, and innovative ideas on where and how to apply data mining techniques to solve real-world problems, be it in business, government, education, or health care. In my past online courses, I’ve had students with work, family, and many other things to juggle in life. I’ve also had soldiers. I’m excited that with online learning techniques, we can make learning happen no matter where we are and how busy we are.

What are your current research interests?

I am particularly interested in sentiment analysis and opinion mining. My past work includes algorithmic reading of Congressional speeches in the past 20 years, identifying linguistic cues for political opinion expression, and tracking the political ideology shifts manifested in the topic and opinion change in the speeches. This was some of the earliest work in what is now called computational social science. Our method was later adopted by researchers in other countries to study language and politics in other political systems.

I’m currently building a citation opinion retrieval and analysis tool (CORA), which aims to automatically identify a citer’s opinion toward cited work in academic publications, and thus help summarize citation opinions for researchers overwhelmed by the huge amount of academic literature. I’m also working with librarians at the SU library to use CORA to help students learn information literacy and critical thinking.