IBM’s Anant Jhingran – CTO & VP Information Management Group – Podcast Transcript

To Listen to the Podcast go to PodTech.net  

Guest: Anant Jhingran – IBM CTO and VP Information Management Group
Host: John Furrier – Founder of PodTech.net

John Furrier – Founder of PodTech.net
Welcome to the PodTech.net InfoTalk series. We are here with Anant Jhingran, the CTO and VP of IBM’s Information Management Group in San Jose. Welcome to the podcast.

Anant Jhingran – IBM

Thank you very much, John. Thank you for having me.

John Furrier – Founder of PodTech.net
You are a senior executive at IBM. You’ve been a distinguished engineer and you’ve grown through the ranks. You are leading a group of people within IBM around search, knowledge management, and a bunch of cutting-edge technologies. I’m excited to chat with you and talk about some of the things that you see happening in the search world and in the enterprise in general. Talk about search, knowledge management, Web 2.0; what are your views?

Anant Jhingran – IBM

In general, I think that the next frontier in the enterprise is definitely about tapping all the information that is hidden beneath application silos or on people’s desktops or departmental servers. Information that is used for transactions has very much been tapped, and people have built siloed applications incorporating the data that is needed for their payroll processing, for their audit processing, for their procurement processes, and others. There is a wave of next-generation applications being built that combine these siloed apps; they are called composite apps. That is one wave that is happening in the enterprise space. The second wave is around knowledge workers’ deeper information-finding tasks. John, you and I spend far more time looking for information than just dealing with relational tables and structured artifacts, whereas a lot of the information management business has been built on structured artifacts. Therefore the next wave is really going to happen around being able to tap the right information for knowledge tasks. That is not just about search, because with search in some ways (as you point out, like Web 1.0) you give a set of keywords and you then determine for yourself what the answer is. In reality, what people are trying to do is accomplish a task. Really, you have to go from the “do what I say” paradigm of search to really understanding the context, and move to “do what I mean.” So: understand the context of what the knowledge worker is trying to do, assemble the right information for that knowledge worker, assemble it from sources internal and external, and then combine them to help solve that knowledge task.

John Furrier – Founder of PodTech.net
This is great stuff, but I just want to stop you for a minute and drill down into what we are talking about here. You mentioned Web 1.0. Web 2.0 is here and a lot of people are talking about it, but it is mostly consumer. You have a great view into what Web 2.0 means for the enterprise, but you also talked about search. What is the Web 2.0 search environment? As you told me before the podcast, IBM has huge experience in databases going back to DB2, and historically there is a deep relationship you guys have with search. What is the Web 2.0 environment for search, both for consumer and enterprise, and how does the unstructured environment that we live in play into this?

Anant Jhingran – IBM

Two aspects of this. What has really happened is that, at some level, Web 1.0 was formed around HTML and text. People built up text processing and statistical techniques which did not try to deeply parse the meaning of the text, but were much more about statistical aggregation techniques and others. Which is great, because there is a huge amount of information on the web that you can actually go and tap at a purely statistical level. But if you really want to conduct transactions, or you want to generate high levels of meaning and value, then you’ve got to understand the concepts that are hidden behind the text, and you’ve got to be able to relate those concepts with the right sets of APIs and the right set of composition tools, so that you can actually build high-level concepts as opposed to just reading it as text. So, that is the next wave. What happens is that the enterprise wave around context is inherently different from the context wave on the outside. In the enterprise there is a concept of a customer. But the concept of a customer per se is deeply buried. Not just in textual documents; it is deeply buried behind SAP applications, behind CICS and COBOL applications, and others. And really the next wave of Web 2.0 is all about being able to combine all those dispersed sources of information that sit within an enterprise, so that you can manifest that concept of a customer out of all those dispersed sources, and now you can relate a customer with another customer, a customer with a branch office, and therefore you can form that Web 2.0 kind of relationship. But assembling those objects is a different task from what is happening on the outside web.

John Furrier – Founder of PodTech.net
It’s an enterprise mash-up. 

Anant Jhingran – IBM

It very much is an enterprise mash-up. I wish things were as simple as APIs, because what happened is that we got kind of a clean sheet of paper in the mid-1990s to define a new world on the web. Unfortunately, on the enterprise side, there is more than some legacy, if I may say so, right? And really, therefore, things are more complex.

John Furrier – Founder of PodTech.net
That is changing. How do you see that? Obviously there is legacy, but the new world is also shifting to an unstructured environment, and the consumer web is penetrating the enterprise. So how is that blending in, with all this unstructured data coming into, in essence, a structured environment?

Anant Jhingran – IBM

What is happening is this. If I look at it from a pure information management perspective, which is a broad area that we talk about at IBM and that deals with data and other things, it’s a big industry. It’s a big industry that basically took off when people standardized around SQL. And it’s a huge industry, multiple tens of billions of dollars. But the industry is mostly focused around structured data in the enterprise. So people have built solutions for solving business problems around that data. And well-established practices have been built up. People have established what the best practices are and what we should do. There are debates among people about whether they should follow the Kimball school of warehousing or the Inmon school of warehousing, which are all good, healthy debates, but it is a well-established field. On the unstructured side, people are realizing that it is much more like the wild, wild west within the enterprise. So two things are happening on the unstructured side. One is that people are beginning to build the same sets of best practices on the unstructured side as have been built up on the structured side. So, what’s the concept of content management that is equivalent to warehousing? What does data movement like ETL mean? What’s the concept of mining in text analysis? What’s the concept of navigation of this? So there is one set of things which is really about establishing best practices for all unstructured data. But I think there is an even bigger wave happening, which is really bringing structured data and unstructured data together to solve the consumer problems. And I think that in both of these cases, enterprise customers will expect great things out of enterprise data, but the user expectations will be set by the experience on the outside. Google, for example, has spoiled people by… so that you type very little.
You type in one word, or 1.2 words on average, and if the right answer is not there in the first three or four web links, then you are annoyed. Now, think about all the science that needs to go into getting from one or two words of user typing to really understanding what the user wants and being able to present it to the user. So I think this expectation from the consumer experience is an expectation that people have on the enterprise side. They want it on enterprise data, which is much more difficult to corral because it doesn’t have the same social networking characteristics and the same sets of hints that the external data has.

John Furrier – Founder of PodTech.net
Talk about the hints, because that is an important topic. So, Google does that. They are doing it because they have algorithmic search, but it is also because, as a baseline for the way they catalog the data, they have a “clean sheet of paper.” What you are saying is that on the enterprise side, with legacy, it is so much harder. Is it different databases, different syntax? Why is it harder in the enterprise to do a two-word search and be as relevant as Google?

Anant Jhingran – IBM

Two reasons for that. One is that a great benefit for web search is the so-called social linking. The PageRank algorithms that Google and others talk of… work really well because if something is good, other people point to it. So instead of just stuffing a web page with the right sets of keywords, which is what the first generation of web search engines brought about, you could use peer reviews as a way of determining whether this is a good page or not a good page. What happens is that on the enterprise side such peer reviews don’t exactly happen. Not that they shouldn’t happen, but enterprise information architectures have typically not evolved in the same democratic or federalist kind of way that the web structure evolved. That, I think, is the first main difference. The second, as you mentioned, John, is really that there is a huge amount of information that is buried in what are not HTML documents. In fact, today most businesses are being run on structured relational artifacts. You cannot just turn around and say, “We’re going to ignore that. Now we’re going to do all of this purely on unstructured email and web documents and others.” That’s like throwing the baby out with the bath water. You’ve got to bring that into play also. The techniques that work for structured data in order to do these kinds of rankings and ratings and others are very, very different from the techniques that work for unstructured data.
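
Editor’s sketch: the “social linking” idea in the answer above, where a page counts as good because other pages point to it rather than because of the words it contains, can be illustrated with a small link-scoring routine. This is a simplified power iteration in the spirit of link-based ranking, not Google’s actual algorithm; the function name `link_rank`, the damping value, and the toy page names are assumptions made for illustration only.

```python
def link_rank(links, damping=0.85, iterations=50):
    """Score pages by who links to them, using a simple power iteration.

    `links` maps each page name to the list of pages it points to.
    Returns a dict of page name -> score; the scores sum to 1.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical toy web: nobody links to the keyword-stuffed page.
web = {
    "keyword-stuffed": [],
    "blog-a": ["useful-page"],
    "blog-b": ["useful-page"],
    "blog-c": ["useful-page", "blog-a"],
    "useful-page": [],
}
scores = link_rank(web)
best = max(scores, key=scores.get)
print(best, round(scores[best], 3))
```

In this toy graph the page that three others point to ends up with the highest score, while the keyword-stuffed page gets only the baseline. An enterprise document collection with few or no such links gives a routine like this very little to work with, which is the first difference the answer describes.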

John Furrier – Founder of PodTech.net
So that’s really the legacy thing.  So going forward where does enterprise search go from here?

Anant Jhingran – IBM

I think that enterprise search itself will go in two directions. One is: it will become, as I call it, one of the “-ities.” Can we use this word, “-ities”? As with IBM: nobody buys stuff from IBM without it having the right quality, availability, serviceability, scalability, and other things. Things that are like air, that people just expect. So they buy a content management system from IBM, but they expect these “-ities” to be there.

John Furrier – Founder of PodTech.net
Like reliability, etc., etc.

Anant Jhingran – IBM

I wouldn’t go there, but that is exactly right. Search in a way will become a fabric of the next generation of middleware, in the same way that nobody writes any application today without depending on a back-end database. All of the PHP apps, for example, are run in some cases with MySQL backing, in some cases with other relational backing; all enterprise apps do that. So one direction of search really will be that it will become one of the fabrics of the middleware that will be embedded everywhere, just like relational databases are embedded. And the second place where search will go, as I hinted in my earlier statements, is that it will really move from “do what I say” to “do what I mean.” Which means it will really get to information finding in the context of the task. And the fact that there is a search happening in the back is going to be more or less irrelevant, because what people are really interested in is being able to determine business intelligence and the right information. So, just to give you an example, on the structured side I can ask, “Which of my customers are poor customers?” It’s simple. The way this is done is you define poor to be customers that generate less than “N” dollars for you per year. You sort the customers by the amount of dollars they generate, you pick the ones that generate less than “N” for you, and you get the answer. If you wanted to do the same kind of analysis… Now you say, “That’s great.” But now I say, “Tell me about my “N” customers who are poor customers for me but are really being written well about, or are well recognized, or are on the path for growth.” Now you have a huge amount of information about these customers. Now I want to combine it with this particular thing. And you run into a brick wall. You run into a brick wall because the concept of a customer on this unstructured side is not written down. You have to analyze the documents and everything else.
And there is no one real keyword query that would be able to bring that information to you. You have to combine this. That’s where you shift from pure search, as in keyword search, to real information finding. Many of IBM’s customers are doing this in the context of call center records, for example. You and I have talked about it in the past: call centers have structured information about who called and when they called, and they have a lot of things that are recorded about the call, or even the call itself recorded. In order to really be able to analyze what the root cause is, you cannot just depend on the structured information, and you cannot type keywords. You really have to extract concepts out of unstructured data and relate them. To cut a long story short, search will really go in two directions. One is that it will become an “-ity,” and that’s probably a phrase that will get quoted a lot. The other is that it will move to a higher-level plane, with respect to “do what I mean” as opposed to “do what I say.”
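
Editor’s sketch: the “poor customers” example in the answer above can be written out in a few lines. The structured half follows the steps as described (define a threshold “N”, sort by revenue, keep those below it). The unstructured half is only a stand-in: plain keyword matching replaces the concept extraction the speaker says is really needed, and all names, figures, and documents below are hypothetical.

```python
# Hypothetical list of words treated as a sign of good press or growth.
POSITIVE_WORDS = {"growth", "growing", "award", "expanding", "praised"}


def poor_customers(revenue_by_customer, threshold):
    """Structured side: customers generating less than `threshold`, lowest first."""
    ranked = sorted(revenue_by_customer.items(), key=lambda item: item[1])
    return [name for name, revenue in ranked if revenue < threshold]


def positive_mentions(documents, customer_names):
    """Unstructured side: count documents naming a customer next to a positive word.

    Real text analysis would have to recognize the customer as a concept;
    this substring match is only a placeholder for that step.
    """
    counts = {name: 0 for name in customer_names}
    for text in documents:
        lowered = text.lower()
        words = set(lowered.replace(",", " ").replace(".", " ").split())
        if not words & POSITIVE_WORDS:
            continue
        for name in customer_names:
            if name.lower() in lowered:
                counts[name] += 1
    return counts


def poor_but_promising(revenue_by_customer, threshold, documents):
    """Combine both sides: poor customers that are written about favorably."""
    poor = poor_customers(revenue_by_customer, threshold)
    mentions = positive_mentions(documents, poor)
    return [name for name in poor if mentions[name] > 0]


revenue = {"Acme": 1200, "Globex": 90000, "Initech": 800, "Umbrella": 300}
news = [
    "Initech praised for rapid growth in new markets.",
    "Umbrella closes two offices.",
    "Globex wins industry award.",
]
print(poor_customers(revenue, 5000))
print(poor_but_promising(revenue, 5000, news))
```

The first function is the easy, structured query. The second is where the “brick wall” shows up in practice: a document may refer to a customer by a subsidiary name, a product, or a person, and none of that is captured by a name match, which is why the answer argues for extracting concepts rather than typing keywords.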

John Furrier – Founder of PodTech.net
With it being a fabric, now we’re going to have an “S” on LAMP, so LAMPS.

Anant Jhingran – IBM

Exactly.  That is exactly right.

John Furrier – Founder of PodTech.net
Or whatever stack people are using. We’re here with Anant Jhingran from IBM. Final question: a quick prediction on enterprise search. Where is it going to be in five years? Will it be fully integrated? And what does that mean?

Anant Jhingran – IBM

So the answer is absolutely yes. I think that the rate and pace of growth of enterprise information finding in our unstructured data (we call it discovery as opposed to search) is much faster than the rate and pace of growth of the penetration of relational models early on. Therefore, what took fifteen years for relational models to do will probably take five for enterprise discovery. Exactly right. The second phenomenon that enriches this is the consumer phenomenon. The consumer phenomenon, the web phenomenon, and others have really raised people’s awareness of search capabilities, and it is that awareness that makes for a much easier discussion now in the enterprise with respect to information discovery than we would have had if people were not aware of the power of search to comb over terabytes of information and find almost the right information, very quickly.

John Furrier – Founder of PodTech.net
That’s the key. Finding the right information at the right time. Great prediction: adoption accelerated relative to relational models, with high consumer awareness. Thank you so much for the podcast.

Anant Jhingran – IBM

Thank you very much. 

 For more on IBM’s IT Services visit: http://www-1.ibm.com/services/us/index.wss/gen_itservice

Author: John

Entrepreneur living in Palo Alto California and the Founder of SiliconANGLE Media
