By Anmol Rajpurohit, Mar 19, 2014.
leads LinkedIn's efforts around query understanding. Before that, he led LinkedIn's product data science team. He previously led a local search quality team at Google and was a founding employee of Endeca (acquired by Oracle in 2011). He has written a textbook on faceted search, and is a recognized advocate of human-computer interaction and information retrieval (HCIR). He has a PhD in Computer Science from CMU, as well as BS and MS degrees from MIT.
Here is my interview with him:
Anmol Rajpurohit: 1. Delivering high quality search results from users' unstructured search queries is no trivial task. Can you describe the major challenges in improving search quality?
Actually, I'm happy to refer readers to a recent presentation delivered by my colleagues Abhimanyu Lad and Satya Kanduri on "Search Quality at LinkedIn". Basically, our challenges in query understanding are detecting and correcting misspelled queries, segmenting and tagging queries to identify the precise entities of interest to the searcher, identifying the vertical domains most likely to serve the searcher's need, and expanding the query to increase recall. On the ranking side, our biggest challenge comes from having to combine a machine-learned ranking approach with a high degree of personalization.
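To make the query understanding steps named above more concrete, here is a minimal Python sketch of spelling correction, segmentation and tagging, and query expansion. It is an illustration under stated assumptions, not LinkedIn's implementation: the dictionaries `ENTITIES` and `SYNONYMS`, the tag names, and all function names are hypothetical, and the fuzzy matching from Python's standard `difflib` module stands in for whatever spelling model a production system would use.

```python
from difflib import get_close_matches

# Hypothetical toy dictionaries; a production system would learn these from data.
ENTITIES = {
    "software engineer": "TITLE",
    "data scientist": "TITLE",
    "linkedin": "COMPANY",
    "google": "COMPANY",
    "python": "SKILL",
}
SYNONYMS = {"software engineer": ["software developer", "programmer"]}
VOCABULARY = sorted({word for phrase in ENTITIES for word in phrase.split()})


def correct_spelling(tokens):
    """Replace each token with its closest vocabulary word, if one is close enough."""
    corrected = []
    for token in tokens:
        matches = get_close_matches(token, VOCABULARY, n=1, cutoff=0.8)
        corrected.append(matches[0] if matches else token)
    return corrected


def segment_and_tag(tokens):
    """Greedy longest-match segmentation against the entity dictionary."""
    segments, i = [], 0
    while i < len(tokens):
        # Try the longest candidate phrase starting at position i first.
        for j in range(len(tokens), i, -1):
            phrase = " ".join(tokens[i:j])
            if phrase in ENTITIES:
                segments.append((phrase, ENTITIES[phrase]))
                i = j
                break
        else:
            # No known entity starts here; keep the token as a plain keyword.
            segments.append((tokens[i], "KEYWORD"))
            i += 1
    return segments


def expand(segments):
    """Attach known synonyms to each segment to increase recall."""
    return [(phrase, tag, SYNONYMS.get(phrase, [])) for phrase, tag in segments]


def understand(query):
    """Run the three steps in order on a raw query string."""
    tokens = correct_spelling(query.lower().split())
    return expand(segment_and_tag(tokens))


if __name__ == "__main__":
    print(understand("sofware enginer google"))
```

With these toy dictionaries, the misspelled query "sofware enginer google" is corrected, segmented into a title ("software engineer") and a company ("google"), and the title is expanded with its synonyms. Vertical selection and personalized ranking are left out of the sketch.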
AR: 2. As the amount of data available - online as well as offline - keeps increasing exponentially, Information Retrieval (IR) is attaining unprecedented importance. Looking at current trends in IR research, which do you consider the most significant in the short term (next 2-3 years)?
Not surprisingly, I believe that query understanding will play an increasingly important role. I believe the current state of query understanding is significantly less mature than that of ranking, and that we'll see the most investment in query interpretation and elaboration, especially search-assist interfaces.
AR: 3. I must admit that "Head of Query Understanding" seems to be a very interesting and innovative designation that I had never heard of before seeing it on your LinkedIn profile. What are the roles and responsibilities of "Head of Query Understanding" at LinkedIn? How is this role different from your past designation "Director, Data Science" at LinkedIn?
I created the team, but you'll find from a quick search on LinkedIn (naturally!) that other companies do have individuals and teams that focus on query understanding.
The goal of query understanding is to work with users to establish their query intent.
Query understanding mostly takes place before the search engine retrieves any results -- it focuses on analyzing and possibly rewriting the query, as well as assisting the query elaboration process through auto-completion and suggested searches.
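As a rough illustration of the auto-completion idea mentioned above, the following Python sketch suggests completions by matching the typed prefix against previously seen queries and ranking them by frequency. The query log, the function name `autocomplete`, and the frequency-based ranking are all assumptions made for this example; the interview does not describe how LinkedIn's search assist is actually built.

```python
from collections import Counter

# Hypothetical query log; real systems would use far larger, personalized data.
QUERY_LOG = [
    "data scientist", "data scientist", "data engineer",
    "data analyst", "database administrator", "designer",
]
QUERY_COUNTS = Counter(QUERY_LOG)


def autocomplete(prefix, limit=3):
    """Return the most frequent logged queries that start with the prefix."""
    prefix = prefix.lower().strip()
    candidates = [(q, c) for q, c in QUERY_COUNTS.items() if q.startswith(prefix)]
    # Sort by descending count, breaking ties alphabetically for stable output.
    candidates.sort(key=lambda item: (-item[1], item[0]))
    return [q for q, _ in candidates[:limit]]


if __name__ == "__main__":
    print(autocomplete("data"))
```

A linear scan like this is only reasonable for a toy log; a larger system would more likely use a prefix index such as a trie, which is beyond the scope of this sketch.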
My previous role as Director of Data Science was broader but less focused. Transferring into my current role has allowed me to focus on search, which has been my driving passion for most of my professional career.
AR: 4. Given your expertise on LinkedIn search, I have to ask you this question. What steps should LinkedIn users take so that their profile shows up prominently in the search results for relevant keywords (and thus helps advance their careers)?
If I told you, I'd have to kill you! :-)
No, really, there are many things you can do to improve your findability without crossing the line into abusive search engine optimization. Here are my top four recommendations:
- Complete every profile field. Make sure you can be found by your current position, past positions, the school you attended, what you studied there, etc. Profile completeness not only helps you show up in more searches, but also improves how you are matched in our various recommender systems.
- Add your skills. Skills are among the most frequent queries performed by recruiters and hiring managers.
- Connect to all the people with whom you have a professional relationship. For many kinds of searches, having a stronger network connection to the searcher improves your ranking.
- Use standard job titles. Some people like to have fun job titles like "Chief Janitor" or "Head of Query Understanding". While these titles may favorably communicate your personality, they aren't great for your findability. Standard job titles may be boring, but they are what people search for.
AR: 5. You have had a long and outstanding career in Data Science. Looking back, had you ever expected Computer Science and Mathematics to be a killer academic background? What advice would you give to current students aspiring to a long career in Data Science?
I pursued computer science and math because I loved them -- my original aspiration was to study combinatorics and work in academia. Fortunately, I discovered the world of practical industry applications, and I've never looked back.
My advice to aspiring data scientists is to learn current technical skills (e.g., languages like Python and Scala; frameworks like Apache Spark) but even more importantly to learn what kinds of problems you like. I find that the best data scientists combine analytical and technical ability with a strong grounding in at least one of the social sciences, especially economics or sociology.
AR: 6. On a personal note, what book (or article) did you read recently and would strongly recommend?
Perhaps not what you have in mind, but I love everything that Neil Gaiman has written, and I recently re-read American Gods. I also enjoyed Ricardo Semler's Maverick, an autobiographical book about how he introduced participative management (what people are now calling "holacracy") in a Brazilian manufacturing company that produces $200M in annual revenue.