This year our podcast went through a major change. The B2B Market Research Podcast became B2B Revealed. Our focus broadened to a myriad of issues that impact our clients in the B2B technology sector. The result? More interviews, more thought leaders, more insight, and an even better B2B Podcast.
It was a good year for B2B content. As we reflect on 2017, we bring you a countdown of our most read articles of the year.
Mid-market marketing is no easy task. Especially if your product is the first of its kind.
However, a savvy marketer knows how to effectively market a product in a new category. Matt Ipri is an expert in doing just this.
As the vice president of marketing and business development at Decision Lens, Matt brings a lot of first-hand experience to the table. In this episode, Cascade Insights CEO Sean Campbell chats with Matt about strategic mid-market marketing.
Become a Master of Mid-Market Marketing.
Listen To Learn How To:
- Target multiple industries with your website.
- Maximize the effectiveness of your current content.
- Build separate strategies for market awareness and product awareness.
- Work well with the analyst community.
- Handle competing with your customer.
- Reap the benefits of creating a new category.
Understanding customers’ key buying criteria is vital to having a competitive edge. To give the people what they want, you have to know what they want.
In this podcast, we cover ethics issues unique to the B2B market research and competitive intelligence space.
Episode #102 of the B2B Market Research Podcast
During this podcast, we cover:
- What Agile is and what it can do for your B2B market research efforts.
- Why Agile is no longer limited to the technical software industry or product development.
- How to leverage Agile for fast, optimal “no surprise” results for clients.
Thank you for listening to this episode! If you enjoyed it, please feel free to share it using the social media buttons on this page.
We would also be VERY grateful if you could rate, review, and subscribe to the B2B Market Research podcast on iTunes, Stitcher, or TuneIn.
Sean Campbell – CEO of Cascade Insights
Zach Simmons – CEO of Discuss.IO
This podcast is brought to you by Cascade Insights. Cascade Insights specializes in market research services for B2B technology companies. Our specialization helps us to deliver detailed insights that generalist firms simply can’t match. To learn more about us, visit our company profile. Also, be sure to check out our free market research resources and don’t forget to sign up for our newsletter.
With us today is Zach Simmons, the founder and CEO of Discuss.IO. Discuss.IO is a very interesting company, not only because of what they do, but also how they do it.
Zach, why don’t I give you just a minute here to explain your role in the company and what you guys offer.
Certainly, and thanks for that. Discuss.IO is really laser-focused on making the synchronous interview process more efficient and more simple. What we specialize in is having a single online platform where individuals can recruit, host, and execute interview sessions — and then analyze those results. We put it all in a nice DIY package that’s backed by our support team. This allows us to provide researchers and branding teams with a platform that provides Agile market research.
Excellent. I think you might know where I’m going to go with my first question because of how we originally bumped into each other, and that question centers on the importance of Agile.
I think a certain percentage of our listener base is saying to themselves right now, “I know what Agile is.” These are probably people that come more from a technological background, maybe even originally from a product development background. Then there is probably another section of our audience base that says, “Well, you just mean how you can be more fluid, and be more dynamic when conducting research efforts.”
But Agile is a much bigger and broader subject than that. Given that, would you unpack the importance of Agile: what it is, and how it contrasts with Waterfall, etc.?
Sure. That’s a very broad question, but at its core Agile is a new product development methodology. Agile differs from traditional Waterfall, like you mentioned, in that your process and turnaround time (the horizon, I should say) is significantly shorter when you are dealing with an Agile product development effort.
This process really impacts the downstream individuals that participate in concept testing and product testing, and matching that cadence is quite important at a business-unit level.
At a very high level, what we’re looking at is a fundamental difference in how products are brought to market. This approach started in tech but it’s continuing to migrate throughout the entire business landscape. It was pushed, in part, by books like Eric Ries’ The Lean Startup, which has helped popularize Agile-based methods in non-tech based sectors.
Where you stopped there is a really key point. I think one of the things I’ve noticed, and I know you’ve noticed, is that the Agile mindset is impacting product development in a big way. Perhaps most in technology and software companies, but it’s broader than that, as Agile is now extending its reach into other business processes.
These processes are being asked to move forward using this very iterative and Agile heartbeat as well. I think that lines up pretty well with what you guys do, in the sense that you’ve obviously noticed something similar even in the world of market research.
Given that, please share your thoughts on how Agile is extending its reach into other types of business processes and how Agile is not just a software industry or product development thing anymore?
Exactly. Market research and the marketing function as a whole really have to march in step with what the product development life cycle looks like.
We have to start with that context as researchers, or as any support function. We have to recognize that if our end clients are ultimately changing how they are developing their products then we have to, as support functions of that life cycle, go and match the same cadence as they have.
Those kinds of impacts are where we see market research being shaken up because traditional agencies, by and large, have not really adjusted their processes around how they bring results to their clients. We’re still very much in the world where most agencies are not Agile-ready, and they are still taking months to go through, set up, and execute a project.
Ultimately, that turn-around time is the key difference that we see from agencies that are focused on Agile: they understand how it works versus those that still are working within a Waterfall methodology of 20 years ago.
Exactly. Generally speaking, as competitive intelligence professionals, I think that’s one of the things that’s a little bit different between CI and B2B market research vs. traditional B2C focused market research, let’s say. We’ve always been a little more investigative, and more “agile,” simply because the samples we target are harder to reach, hence the approach has always been a little more iterative.
Obviously everyone hasn’t quite taken it to the degree we’re talking about, where you are working in two-week sprints and structuring research so it can be done in chunks, thereby matching the stories, timelines, etc.
Let me switch this to what I think is a really important part of Agile projects, which is the relationship with what Agile folks would call the “product owner,” whereas a market research team would call it the “key stakeholder,” and the same goes for the CI team. So how do you produce Agile insights and then deal with the consumption of those same insights in an Agile way? Because one of the challenges with Agile, of course, is you have to have an involved product owner. Otherwise, you basically just ship a bad product in smaller chunks.
So, what’s the best way to deal with the consumption of those insights? Because, I find organizations like the idea of Agile, but they have a harder time actually doing their part when it comes to Agile.
That’s a very good point. Let’s take a step back and first talk about the concept of the product owner.
The product owner is the keeper of the product vision. They are the individuals that are driving, walking, and tackling the development of the roadmap and those kinds of things. Basically, the product owner is not just a stakeholder, but also the one with the most questions. When you are supporting that product owner they should be able to identify what the critical questions are for the next two- to eight-week time frame.
Normally, they’re answering individual questions and are not looking for exhaustive conclusions. Rather, they are focused on developing a continuous string of insights, and obviously the questions will change as they’re navigating their own roadmap.
Those are all really good points. Let me ask you a follow-up question: What’s a good set of best practices when it comes to coaching the stakeholders and product owners on what they are about to receive?
I imagine, like most things that are different when it comes to business processes, that people carry their preconceptions with them a little bit. I imagine there might be a little bit of, “Well that’s great! You’ll give me more stuff more quickly, faster, I’ll see it more often, and I’ll have as much time to decide as I used to.” And they think they have all the time in the world to give you feedback.
How do you guys go about essentially educating? Obviously you’re working with the agencies in this regard, but I imagine there’s a lot of discussion about how those agencies educate their end stakeholders.
Exactly right. It’s very much a training and “feeling each other out” scenario, and we learn as we go with our clients.
One of the key characteristics is the concept of ongoing deliverables: small, pre-selected pieces that add up to the set of studies or deliverables you have in mind. Having that steady stream of deliverables keeps you actively engaged as more than just the market researcher; you are the information partner that is delivering insights throughout that process.
Your deliverables may not even be clear to you as you enter into an engagement for something that’s three months down the road. Rather you have committed to, under some kind of retainer model or such, creating that type of recurring insight — and you’re again a partner at the table when people are doing what they call the “sprint planning cycle.” Being involved and knowing what those critical questions are that the product team need answers to as they’re starting to form is really the best way to insert yourself as, again, a key partner in the process.
True, but one of the things I’m sure listeners are thinking about is this: What happens when my stakeholders, in singular or plural, have left the building? Perhaps for no fault of their own; they’re just completely unable to interact with me, perhaps for weeks at a time. Is your recommendation that the project, by necessity, must pause if there’s no meaningful backlog to work through, or if there’s a real chance that the story prioritization has to really change and you have to wait to get said feedback?
Because I think one of the “benefits” of the larger scale projects, if I might put it that way, is that some people think the vendor will continue to work without my interaction to some degree. Agile presupposes that the stakeholder / product owner is fairly well engaged, and that you and the research vendor have an ongoing seat at the table.
But we both know that in the real world, sometimes the stakeholders check out and it’s not so much their fault as they’ve just moved to other things for a period of time. So if they don’t have the ability to pay attention to the project, or to tell whom they report to, I imagine that can create some real challenges in terms of project flow.
Yeah, this is a very good point that listeners should pay attention to. It’s very important you develop a relationship with your product owner and your stakeholders that recurring and ongoing research is something that happens whether or not they are involved directly.
Having some kind of database of questions, we will say for simplicity’s sake, is vital because the moment that a week goes by without any contact or deliverable, and the bus just stops, then you’re dead. The whole project no longer is a continuous stream of deliverables.
You have to make sure that you, as an insight professional, have accumulated your own backlog, so to speak, of deliverables — and therefore you have a relationship in play such that the product owner knows that every other Friday there will be some kind of deliverable on their desk. If they’re really engaged it will be the exact deliverable that they wanted, if they were less engaged it might not be. They have the expectation that it is on them to help nurture and support you as a partner searching for the same insights that are keeping them up as a product owner at night.
Yeah, excellent point and good coaching, too. One of the really big benefits that I want to make sure we don’t forget to mention is that Agile avoids surprise endings, unlike a Waterfall-oriented market research or competitive intelligence study. When the vendor goes off for x number of weeks or months and comes back and presents to the room full of executives, there can be a lot of surprises at that point, and not always pleasant ones. What you are sharing may be accurate data and insights but it can still create surprise.
Hence I think one of the benefits of Agile is that you have the ability to socialize and essentially avoid surprise while still getting insight. I think that’s a really powerful coupling, especially when you are talking about research and large companies as your client.
This is actually the original reason why Agile was created over a decade ago. Software developers were running into the exact same thing. They’d run off… take in requirements, talk with some users, spend nine months building something, and then there was a giant reveal.
They kept missing the mark. Projects would go over, they would be expensive. This is exactly what Agile is there to solve, and market research has paralleled to that same place. So today, you need to create an ongoing relationship and ongoing deliverables that engage the stakeholder. That engagement helps socialize the information, which makes sure that there are no surprises, and that helps to create buy-in. It’s a much more collaborative process and relationship with your ultimate consumer of those insights, and ultimately more productive as a result of that engagement.
I agree, and I think that’s a good place to end it. With that, thanks for joining the podcast and thanks to our listeners for being along with us — hope to have you along on the next episode.
What have I learned after recording 100 competitive intelligence podcasts? A lot. Here are the top ten B2B Market Research podcasts and the things I learned behind the scenes.
In this podcast I cover:
- What I’ve learned from producing 100 podcast episodes.
- The top 10 posts as defined by Google Analytics and social sharing data.
- Which topics resonated most with you and why.
Sean Campbell – CEO of Cascade Insights
After reaching 100 podcast episodes, what have I learned? That’s the subject of today’s podcast.
This podcast is brought to you by Cascade Insights. Cascade Insights specializes in B2B market research services for B2B technology companies. Our specialization helps us deliver detailed insights that generalist firms simply can’t match.
Back in January 2012, I launched the Competitive Intel podcast. Hence, I’ve been producing this podcast for over three years now, which has led to 100 episodes.
Given that, I thought it was the right time to reflect on some of the most popular episodes (based on Cascade Insights’ Google Analytics statistics), and even talk a little bit about what it’s been like to produce a podcast.
With that, let’s get into the top 10.
The top 10 competitive intelligence podcasts
1. Our most popular podcast is the one I did focused on the 5 essential truths of CI analysis.
I based that podcast in large part on an article that a CIA analyst had produced, reflecting on his 40 years in the intelligence business. I talked about each one of the truths he mentioned in turn. And the one that I want to return to is his third truth, “good analysis makes the complex comprehensible, which is not the same thing as simple.” That is a really, really good point to remember. Again, if you want to see the five in total you can always go back and revisit the page where we’ve got that podcast hosted.
2. Our next most popular podcast is the one I did focused on the needs of product managers.
Product managers have questions that run deep, especially in the B-to-B tech sector. I covered a few of these in the podcast, 15 Competitive Intelligence Questions Product Managers Need to Answer.
In the podcast, I looked at a few things — for example, what kind of assets drove customers to pare down the list of vendors they actually engage with? This is a sign of when you’re perhaps losing early in the process and not even aware of it.
I also went into things like what personas were involved in the buying process and what roles were involved. Additionally, I discussed other factors, including the cost of the solution and the key buying criteria that drove the customer to consider a competitor’s offer.
3. The third most popular competitive intelligence podcast was the one that I did on the changing nature of the B to B sales and marketing process. I interviewed the author of The Challenger Sale for this, and talked a lot about how competitive intelligence must adapt to these changes. This remains one of the more popular podcasts in terms of actual social shares, even though it’s third on the list in terms of actual page views.
4. The next most popular podcast, in the fourth position, is the interview with Mark Smith on NodeXL. The NodeXL podcast is popular because NodeXL is one of the best tools you can find that provide a “quick” introduction to social network analysis. That’s a funny way to put it, because I don’t think it’s really a quick way to do social network analysis. But NodeXL is the tool that gets the closest to it, given it’s just embedded right in Excel.
5. The next most popular podcast is when I talked to Kris Wheaton from MercyHurst. MercyHurst is a great institution, one of the best ones out there when it comes to “growing” competitive intelligence analysts. I had a really interesting discussion with Kris about what makes a good intelligence analyst and the program at MercyHurst overall. That podcast was obviously of interest to a lot of folks.
6. The next most popular competitive intelligence podcast is when I talked about hiring world class analysts. In the podcast I talked about a lot of things that go beyond just pure analytical horsepower, down to the level of, “What’s the kind of person you’re hiring, and can they be a really effective consultant?” This, I believe, is key when you’re building a shared services team like a CI team.
7. The seventh most popular podcast was my interview with Wayne Jones of IBM. I think what probably struck people about this podcast is when Wayne talked about the research agenda and how important it is to establish a really good one if you’re going to build a well performing CI team.
8. The eighth most popular podcast was on uniting competitive intelligence and market research efforts. I think the reason this podcast is so popular is that people have struggled to figure out the boundary between these two disciplines. The outer edges of each discipline are pretty easy to identify, but the boundaries between them as they merge into one another…well…that’s a little harder. I talked about marrying the customer-centric view of market research with the outward view of competitive intelligence, and I can only assume that that was one of the key things that resonated.
9. The ninth most popular podcast in terms of page views — although interestingly enough, this is one of the more popular ones in terms of actual podcast audio plays — is Three Key Ways that Competitive Intelligence is Different Than Spying. I talked a lot about how competitive intelligence professionals disclose their identity, how they practice their craft, and walked through a lot of the differences.
I think this is a question that many people have about competitive intelligence, including where the line is in terms of ethical and non-ethical behavior.
10. The tenth most popular podcast, 5 Fundamental Truths, distills key takeaways from the CIA analyst’s article I discussed at the top of this post.
What I’ve learned about podcasting…
In closing, I’ve also learned a lot of things about making a podcast. Audio quality is increasingly important. It’ll never be perfect, especially when you’re doing interviews (particularly if those folks are remote and in other countries), just because audio quality issues aren’t fully within your control. But I constantly strive to make those things as good as I can.
I’ve also learned that you like transcripts just as much as you like the podcasts, and I think there’s a simple reason for that. For some people, I think sometimes it’s easier to read it than listen to it. Hence, I’ve made transcripts available and will continue to make them available for our archived podcasts.
What’s the road map for the next 100? Well, maybe some better intro music and perhaps closing music. But if I actually get to that, I’ll probably have too much free time on my hands. :)
Then I’ll probably do some more interviews with luminaries in the CI space, and will also talk about how various disciplines interact with the world of competitive intelligence.
With that, I want to thank you for listening to this podcast.
Photo thanks to: iabusa
Today, we’re featuring a conversation Sean Campbell, Cascade Insights’ CEO, recently had with Thomas Miller. Thomas is a faculty member at Northwestern University and has written a really fascinating book on predictive data analytics called Web and Network Data Science: Modeling Techniques in Predictive Analytics.
So my first question for you, Tom, is if you would share with us a little bit about the book and also about yourself.
Tom: Certainly. Well, the book that Sean’s referring to is my web and network data science book, which was published at the end of last year, 2014. It covers two areas: web analytics and social network analysis (referred to as network science), and tries to bring them together to show the overlap. There is, of course, a section in there and it’s kind of a mutual admiration society. We refer to Sean’s book on Going Beyond Google as well.
What we try to do in the web and network data science book, which is intended for a course we do at Northwestern, is provide students with the right kind of skills and overview that they need to gather information from this humongous resource, the world wide web.
There’s just so much out there, and it’s hard to know how to pursue it, and you need to have the skills to do it. It’s not just point and click. You also have to have some computer programming skills to do it. (Well, to do it most efficiently.) That’s what the book is oriented toward — to provide that overview — and then we use it in the course. The update of the course will have the same name, Web and Network Data Science.
I’m also involved in independent consulting. I’m a kind of entrepreneur myself; I have a small company that is pretty much in startup mode right now, oriented around data science.
Excellent. I think the book you’ve put together is a great resource. I want to ask you a question that ties into your comment about this propensity among a certain type of individual. I’m not trying to be pejorative here, but I’ve met this kind of individual at conferences and in companies we’ve worked with over and over again. This person is fond of saying something like, “Well you know, Internet data is just data. You can’t derive any real meaning from it. It’s just information.”
Do you run into these types of folks? I imagine you do.
Tom: Well, it could be a matter of people being used to what was done in the past where they were doing primary research, design research, and custom research: they collected data for the purposes of a study. When we’re talking about the internet, we’re talking about this massive secondary data resource where you don’t have to design the study in advance. You’re using data that are already available.
Part of it, I think, is the education that’s needed about the value of secondary research generally and secondary information sources.
I think the other part of it is a lack of understanding of unstructured or semi-structured text, and what you can learn from that. In traditional research, statistics courses that you may have had in your undergraduate days (and many people have had them in one form or another), they deal with spreadsheet-like data where it was in columns and traditional databases and relational databases, and they don’t do a lot with text.
The web is text. It’s mostly text. You have to dig into that, and you have to extract. You have to scrape. You have to do the kinds of things that we do in network data science to get some meaning out of it. You have texts, which are unstructured. Somehow you translate those into numbers, which are analyzable in your models. That’s a challenge, and people are not very well educated in that area.
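As a rough illustration of translating unstructured text into numbers, here is a minimal sketch of one common approach, a word-count ("bag of words") matrix. It is not taken from Tom's book; the function names `term_counts` and `to_matrix` and the sample sentences are made up for this example.

```python
import re
from collections import Counter


def term_counts(text):
    """Lowercase the text, split it into word tokens, and count each token."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


def to_matrix(documents):
    """Build a document-term matrix: one row per document, one column per word."""
    counts = [term_counts(doc) for doc in documents]
    # The vocabulary is every distinct word seen in any document, in sorted order.
    vocabulary = sorted(set().union(*counts))
    # A Counter returns 0 for words a document does not contain.
    rows = [[c[word] for word in vocabulary] for c in counts]
    return vocabulary, rows


# Hypothetical sample documents, for illustration only.
docs = ["The web is text.", "It is mostly text, and text is messy."]
vocabulary, rows = to_matrix(docs)
for doc, row in zip(docs, rows):
    print(row, "<-", doc)
```

Each row is now a numeric vector that a statistical model can consume. Real projects usually add steps such as stop-word removal and weighting, which are left out here.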
Those are all really good points. Particularly the point about quantitative analysis vs. text analysis. The Internet is so text heavy, and people sometimes don’t have the right tools to, in essence, turn that into something that’s a little more quantitative for them.
Could you describe the tools that you think are critical to this type of analysis? For example, I know you mention Python quite a bit in the book.
Python is used in the book, and it’s used extensively in data science. Python is primary. That’s because it is such a wonderful language for data munging and data preparation, sometimes called parsing of data. It’s a well-structured language — you don’t have a compile cycle, so you can do things more quickly.
Prototyping is a lot faster in Python than in other languages, and it’s become the de facto standard you might say for systems programming work. It has replaced Perl in many contexts. Because of that, it’s a good initial language to learn. With Python, there’s also the possibility of easily working with databases.
When we talk about databases in this context, we’re not talking about relational databases that you might look at in, for example, an accounting environment. In financial accounting, you have all these dollar signs and numbers to keep track of in rows and columns. What we’re talking about with text is we have to have an unstructured format or at least a more flexible structure. That’s either XML, which is a markup language, or JSON, and JSON in the web world is primary these days because you can actually read it and read it with ease.
You have the general purpose language for manipulating text, for parsing: that’s Python. You have packages within that world of Python to extract data, so you can crawl the web (spidering, as it’s sometimes called) to gather data.
Then you have to scrape the web pages because they’re marked up with HTML (which is structured like text) and you have to get rid of all those extra tags and codes and gather the data that you want — often from within the paragraph tag. That requires a good Python package to do.
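To make the tag-stripping step concrete, here is a small sketch that pulls the text out of paragraph tags using only Python's standard-library `html.parser` module. Tom does not name a specific package, so this is a stand-in rather than his method; the class name `ParagraphExtractor`, the helper `extract_paragraphs`, and the sample markup are hypothetical.

```python
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collect the text found inside <p> elements and drop the markup."""

    def __init__(self):
        super().__init__()
        self.in_paragraph = False  # True while we are inside a <p> element
        self.paragraphs = []       # finished paragraph strings
        self._buffer = []          # text fragments of the current paragraph

    def _flush(self):
        # Join the fragments and collapse runs of whitespace into single spaces.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.paragraphs.append(text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._flush()  # tolerate a previous <p> that was never closed
            self.in_paragraph = True

    def handle_endtag(self, tag):
        if tag == "p":
            self._flush()
            self.in_paragraph = False

    def handle_data(self, data):
        if self.in_paragraph:
            self._buffer.append(data)


def extract_paragraphs(markup):
    """Return the text of every <p> element in the given HTML string."""
    parser = ParagraphExtractor()
    parser.feed(markup)
    parser.close()
    return parser.paragraphs


# Hypothetical page content, for illustration only.
sample = (
    "<html><body><h1>Title</h1>"
    "<p>Agile <b>market</b> research.</p>"
    "<div>navigation</div>"
    "<p>Second   paragraph.</p></body></html>"
)
paragraphs = extract_paragraphs(sample)
print(paragraphs)
```

Inline tags such as `<b>` are discarded while their text is kept, and anything outside a paragraph (the heading and the navigation block) is ignored.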
With JSON, it’s readable and you’re matching up the codes. If you had, say, an email message, you have the “from” node, the “to” node, the cc and the bcc, and then you have the subject and you have the body and you have the web identifier. All of those are keys that can then link to the values associated with them. Some of those values are individual values like the from-node, or an entire array of values, like the to-node and the bcc and the cc.
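The email example can be sketched with Python's standard `json` module. The key names follow Tom's description loosely, and the addresses, subject, and identifier are invented for illustration.

```python
import json

# A hypothetical email message. Single-valued keys hold a string;
# multi-valued keys (to, cc, bcc) hold an array of strings.
message = {
    "id": "msg-0001",
    "from": "sender@example.com",
    "to": ["first@example.com", "second@example.com"],
    "cc": [],
    "bcc": ["archive@example.com"],
    "subject": "Sprint review",
    "body": "Notes from this week's sprint review are attached.",
}

# Serialize to JSON text (human-readable thanks to the indentation),
# then parse it back into Python objects.
encoded = json.dumps(message, indent=2)
decoded = json.loads(encoded)

print(encoded)
print(decoded["from"], "->", ", ".join(decoded["to"]))
```

The round trip preserves the structure: strings stay strings and arrays come back as Python lists.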
All of that can go into this semi-structured, you might say, format called JSON. That’s the last file format you have to understand: the structure of JSON. Now you’ve got it in JSON and now you want to put it in a database so you can actually do queries on it, and there are a lot of ways of doing that. There are specialized databases like MongoDB, but also other tools, for example PostgreSQL, where you can bring JSON in and work with that semi-structured text within a traditional relational database, as a JSON extension. You’ve got all that.
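As a sketch of putting JSON documents into a database and querying them, the example below uses Python's built-in `sqlite3` module. This is a deliberate stand-in so the code runs without a server; it is not MongoDB or PostgreSQL, and the table layout, the helper names `build_store` and `messages_from`, and the sample messages are all assumptions made for illustration.

```python
import json
import sqlite3


def build_store(messages):
    """Load messages into an in-memory SQLite table, keeping the full JSON text."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE messages (id TEXT PRIMARY KEY, sender TEXT, doc TEXT)"
    )
    # The sender is copied into its own column so it can be queried directly;
    # the complete document is stored alongside it as JSON text.
    conn.executemany(
        "INSERT INTO messages (id, sender, doc) VALUES (?, ?, ?)",
        [(m["id"], m["from"], json.dumps(m)) for m in messages],
    )
    conn.commit()
    return conn


def messages_from(conn, sender):
    """Return the parsed documents sent by the given address, ordered by id."""
    rows = conn.execute(
        "SELECT doc FROM messages WHERE sender = ? ORDER BY id", (sender,)
    )
    return [json.loads(doc) for (doc,) in rows]


# Hypothetical messages, for illustration only.
sample_messages = [
    {"id": "m1", "from": "ann@example.com", "to": ["bob@example.com"], "subject": "Kickoff"},
    {"id": "m2", "from": "bob@example.com", "to": ["ann@example.com"], "subject": "Re: Kickoff"},
    {"id": "m3", "from": "ann@example.com", "to": ["cy@example.com"], "subject": "Backlog"},
]
conn = build_store(sample_messages)
found = messages_from(conn, "ann@example.com")
print([m["subject"] for m in found])
```

Document databases and PostgreSQL's JSON support let you query inside the document itself, which this simplified version does not attempt.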
Now you have the other issue of querying the data. Well, how do you query text data? You’ve got to query text data with flexible tools.
The traditional database tools are not sufficient because they require you to look for a specific value, when in text you can say something in many ways, and people often do. And if you’re taking email messages, people are typing. Well, people make mistakes when they type, so how do you do a search across that? In that regard, you need to have an actual language-based tool, or, say, a tool like Elasticsearch that’ll do an effective search across the email messages that you’re looking for.
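To show why exact-match lookups fall short on typed text, here is a minimal typo-tolerant search built on Python's standard `difflib` module. It is only a toy stand-in for a real search engine such as Elasticsearch; the function name `fuzzy_search`, the 0.8 similarity threshold, and the sample messages are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def fuzzy_search(query, documents, threshold=0.8):
    """Return documents containing a word whose similarity to the query
    meets the threshold (1.0 means identical)."""
    query = query.lower()
    hits = []
    for doc in documents:
        # Split on whitespace and trim common punctuation from each word.
        words = [w.strip(".,;:!?") for w in doc.lower().split()]
        best = max(
            (SequenceMatcher(None, query, w).ratio() for w in words),
            default=0.0,
        )
        if best >= threshold:
            hits.append(doc)
    return hits


# Hypothetical email subject lines; the second contains typing mistakes.
emails = [
    "Please send the invoice today",
    "Pleese send the invioce today",
    "Lunch on Friday?",
]
exact = [e for e in emails if "invoice" in e.lower()]
fuzzy = fuzzy_search("invoice", emails)
print(len(exact), "exact match;", len(fuzzy), "fuzzy matches")
```

The exact substring test misses the misspelled message, while the similarity-based search still finds it. Comparing every word in every document does not scale, which is one reason dedicated search tools exist.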
All of those are kind of standard toolbox utilities that you need to have to work effectively in the domain of the web.
So how technical do you need to be to mine these types of assets? By that, I mean whether you’re talking about social network analysis or mining communities or web traffic statistics, or anything else for that matter, related to the subject at hand.
Because my experience has been it depends on the data source you’re looking at, as well as how structured that data is and whether the target website gives you an API or gives at least decent search capabilities. Hence, I know it’s not an easy thing to answer “Yes, you need to be technical” or “No, you don’t need to be technical.” As it obviously seems to depend a lot on the data source and the questions you’re asking.
Yes. I think many of the studies we’re talking about are team efforts, so first there’s a modeler. The modeler doesn’t always understand the IT part, the database part.
There’s the person who does the parsing to put the data into the database, and that person may not understand databases. We have forty faculty and all the skills we could imagine, but if I were to query one of those faculty members who may be teaching, say, the machine learning course and ask, “Could you take on a section of the database course this term?” often I get resistance: “I’m not a database person.”
It’s hard to know it all, and what you need to do, if you’re an independent consultant, is outsource. You bring in the other expertise that you need because it’s not only hard to know it all, it’s also changing. Every year there’s some new technology that’s out there, the latest and greatest. One person can be aware of what’s out there, yes, but one person can’t be a specialist, an expert, in every one of those areas. It’s just too much.
As far as what you can do easily: yes, there are companies that provide APIs, easy access and search facilities, so you don’t have to know anything technical. You just have to type in the string. Think of all the intelligence that’s behind Google itself and the search engines that are out there. Bing and Google, they have all kinds of algorithms and intelligence underneath them to give you the results, but then anyone can get those results. Any of us can type that in. It’s the people with the most technical skills that have the ability to dig deeper and be more efficient in their searches.
How much do you have to know to do basic stuff? Maybe next to nothing as far as specialized skills. But if you want to do a really detailed study, often you’re going to need to know more. This division between IT and modelers is a real one. We see it all the time in the field of data science: some people come to data science from IT and some come from statistics and modeling. Often, they don’t speak to one another. Often, they’re speaking different languages as far as computers go. It’s a challenge, but both are needed to do the work.
Think about the network science part itself, where you’re looking at links and followers and following a Twitter chain or a Facebook chain, or a network on LinkedIn. To analyze those requires special skills in network data modeling. The IT folks don’t have that, usually. A lot of the modelers don’t have that. Network scientists would have that.
That’s the interesting aspect, right? It’s like all new types of analysis efforts borrow a little bit from other disciplines but then are also inherently unique unto themselves.
So with that in mind, let’s address a few more areas. One is social network analysis. I think if you went to an average conference, discussion, or whatever that was on mining internet data, another logical sub-bullet to that would be, “Well, how do you do social network analysis?”
This is an area where, the second you crack it open, you start running into things like “betweenness centrality” and other terms like that. Hence, I think some people quickly feel that social network analysis, in any meaningful way, is beyond them.
Given that, what are some of the baby steps people can take to at least become more aware of the space or the tool set? I mean, you’re educating students all the time, so when you start talking about social network analysis, how do you get them started in a way that they feel like they’re doing something meaningful in a fairly quick manner?
Well, to do something meaningful, you’re going to have to employ some sort of program. There is a variety of programs out there, some easier to use than others. I look primarily at open source, and at programs that are used for other purposes as well. In that regard, there’s a Python package called NetworkX. In R there’s igraph, and some specialized packages on top of that deal with social network analysis.
You start there and you start with examples that are already completed. I can talk about email a little bit. In the book, I show Enron. We do an analysis of Enron and we use R in that particular analysis. Students in my class learn, first of all, how to produce the example. Then I ask them to take it another step further, to do something new, to explore it more extensively.
Now, there are some nice packages out there. One in particular, Gephi, is interactive. It’s easy to use, so you can just point and click in a graphical user interface. You don’t have to know a lot of programming to use it, so one way to explore, to learn initially, is to export the data from Python into a format that Gephi can understand. Then you bring the data into Gephi and you point and click your way to a new analysis.
The big challenge in networking is finding the interesting subnets. That’s been a real challenge. Even a network of 300 nodes, so a network of 300 people, can get really complicated, dense, and hard to understand unless you can dig into it and find your way to the most important players. That’s what you were talking about when you mentioned betweenness centrality. Betweenness centrality, eigenvector centrality: these ideas are in some ways related to what we understand about Google and the links in Google, and references and prestige and “who knows what” and “who’s getting referred to.” All of that is related.
These are just ways of finding our way to the influential people. It might be people with power, it might be people who know more. It might be effectiveness, but it’s people who are referred to more. That’s a tool. The statistics relating to the people or the nodes of the network are ways of getting to the more important ones. When you identify the more important ones, then you have subnetworks you can work with.
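For readers who want to see what a node statistic like betweenness centrality actually computes, here is a small self-contained sketch of Brandes’ algorithm for an undirected, unweighted network. The five-person contact graph and its names are invented for illustration; packages such as NetworkX provide their own implementations, which this sketch does not reproduce.

```python
from collections import deque

# Hypothetical contact network: two groups bridged by "cat" and "dan".
CONTACTS = {
    "ann": ["bob", "cat"],
    "bob": ["ann", "cat"],
    "cat": ["ann", "bob", "dan"],
    "dan": ["cat", "eve"],
    "eve": ["dan"],
}


def betweenness_centrality(graph):
    """Unnormalized betweenness for an undirected, unweighted graph."""
    scores = {node: 0.0 for node in graph}
    for source in graph:
        # Breadth-first search that counts shortest paths from `source`.
        order = []
        predecessors = {node: [] for node in graph}
        path_count = {node: 0 for node in graph}
        distance = {node: -1 for node in graph}
        path_count[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if distance[w] < 0:  # first time we reach w
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:  # v lies on a shortest path to w
                    path_count[w] += path_count[v]
                    predecessors[w].append(v)
        # Walk back from the farthest nodes, accumulating path dependencies.
        dependency = {node: 0.0 for node in graph}
        for w in reversed(order):
            for v in predecessors[w]:
                dependency[v] += path_count[v] / path_count[w] * (1 + dependency[w])
            if w != source:
                scores[w] += dependency[w]
    # Every undirected pair was counted once from each endpoint.
    return {node: score / 2 for node, score in scores.items()}


centrality = betweenness_centrality(CONTACTS)
most_between = max(centrality, key=centrality.get)
print(most_between, centrality)
```

In this toy network the highest score goes to the person who sits on the most shortest paths between the two groups, which is the "find your way to the important players" idea in numeric form.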
One approach is what’s called an egocentric network. You can reduce the problem down to a smaller problem where you’re looking at one person, so it’s your network, Sean, and the people that you know. Then maybe one step beyond that, one order beyond that. You look at all the links that you have to the people you know and then look over one more level to the people they know. As that network grows, and maybe you don’t go all the way out because you’d be dealing with a few orders of magnitude more links, you’ve got a very large network to deal with. That’s one way of approaching it. So have a focus.
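The egocentric idea can be sketched in a few lines: start from one person and keep only the people within a fixed number of steps. The graph, the names, and the `ego_network` helper below are hypothetical, and the default radius of two mirrors the "people you know, then the people they know" description.

```python
from collections import deque

# Hypothetical network, invented for illustration.
NETWORK = {
    "sean": ["ana", "raj"],
    "ana": ["sean", "lee"],
    "raj": ["sean"],
    "lee": ["ana", "kim"],
    "kim": ["lee", "zoe"],
    "zoe": ["kim"],
}


def ego_network(graph, ego, radius=2):
    """Return the subgraph of everyone within `radius` steps of `ego`."""
    distance = {ego: 0}
    queue = deque([ego])
    while queue:
        node = queue.popleft()
        if distance[node] == radius:
            continue  # do not expand past the chosen number of steps
        for neighbor in graph[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    # Keep only links whose two endpoints are both inside the ego network.
    return {
        node: [n for n in graph[node] if n in distance]
        for node in distance
    }


seans_network = ego_network(NETWORK, "sean", radius=2)
print(seans_network)
```

Capping the radius is what keeps the problem small: each extra step can multiply the number of people and links to analyze.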
We have a project in one of our classes, the data prep class, where we’re looking at the Enron email data set, which is public domain, and the students are asked to see if they can find the culprits. Where were the problems? Where did they originate? To whom were these people talking? You’re looking for connections. You’re digging for problems in accounting. Or digging for problems with the California blackouts and brownouts that were occurring at that time that were tied into Enron. The students dig in the text to find the people who were talking about those things, then they find to whom those people were talking, and it’s a lot of fun. It’s detective work. Social networks are used that way.
I think one of the reasons social networks are being talked about a lot is because marketers see them as a way of finding people, finding new buyers. If you buy an iPhone, your friends might be more likely to buy iPhones. It’s a higher hit rate in terms of the marketing effort to go to the friends of the friends of people who are current customers. There’s not a lot of, I think, general interest in the theory of networks so much as in finding a way to new customers. That’s what I think is the driving force behind that interest.
I completely agree. I think that, especially in a business-to-business context, that’s a lot of where this is coming from and it’s a lot of where the commercial tool set obviously is pointing at this point.
One last broad question around predictive analytics: I think one of the things that intrigues people about this, beyond the examples you mentioned, which are critically important, is the fact that we can peer into the future using these types of approaches.
The example I give all the time is the one around search traffic statistics. If you had a competitor that was getting a threefold increase in search traffic to their website, specifically if they were selling a SaaS product where customers can just buy without talking to the sales team, that traffic increase is a clear indicator that there are going to be future sales for that company. Some of that web traffic will translate into a future sales number.
Hence, in many ways, web traffic analysis can be predictive because the sales cycle might last six to twelve months or twelve to eighteen months, so when you look at today’s statistics, you’re seeing a little bit of a view into the future, in terms of next year’s sales for the target company.
With that straightforward example in mind, what are some of the more interesting things you’ve seen when it comes to doing predictive analytics?
Yes to all of what you just said. Predictive analytics is a term I look at as synonymous with data science. What we’re doing is taking a business problem, an understanding of that problem and of the business, taking data and IT, creating models, and putting them all together. Ultimately, what we have to do is speak to the business problem. We have to solve that problem. A lot of times that problem is, “What are we going to do about sales?” or, “What are we going to do about our competitors and how are we going to increase sales?”
The tools that we’re talking about in that area are largely traditional statistical models or machine learning models that do essentially the same thing in a more flexible way. We have explanatory variables. We have things that we understand right now. You mentioned one of them which might be search engine performance, however measured, and we have the response. Response variables are what we’re trying to predict, which is, say, sales, sales in the next quarter.
We look at the past, and we see how those explanatory variables related to the response, and from that, we build a model. The models that we build are sometimes regression models, when we’re trying to predict sales or another quantitative response. Other times, we’re trying to build a classification model, where we’re trying to predict the group. It might be buy or not buy, it might be pay off the loan or not pay off the loan, some kind of categorical response, often binary. Which brand a person’s going to purchase, that’s a categorical and multinomial response.
We build models that are classification models in that context. All of these things are well understood and there are many methods to deal with these problems, and that’s essentially what we do. I have another book which just came out this month, Marketing Data Science, that shows how to do that chapter by chapter. There’s a business problem in every chapter and it shows what kind of model could be used.
Even traditional models can be used. You don’t have to choose the machine learning algorithms to get something useful. You can make predictions about “what if” types of questions. For example, what if we reduced the price of this product by ten percent? What could we expect in terms of sales response given the competitive environment? You play those kinds of simulation games within the context of the model. They’re exciting, interesting, and very important applications of predictive analytics.
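A "what if" question of this kind can be sketched with the simplest traditional model, a one-variable least-squares regression. Everything below is an assumption for illustration: the price and sales figures are invented, the current price of 15 is hypothetical, and a real study would include competitive and other explanatory variables rather than price alone.

```python
# Hypothetical history: past prices and the units sold at each price.
PRICES = [10.0, 12.0, 14.0, 16.0, 18.0]
UNITS_SOLD = [205.0, 178.0, 162.0, 139.0, 121.0]


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x):
    """Predicted response (units sold) at explanatory value x (price)."""
    return intercept + slope * x


intercept, slope = fit_line(PRICES, UNITS_SOLD)

# What-if simulation: cut an assumed current price of 15 by ten percent.
current_price = 15.0
reduced_price = current_price * 0.90
baseline = predict(intercept, slope, current_price)
scenario = predict(intercept, slope, reduced_price)
print(f"Expected change in units sold: {scenario - baseline:+.1f}")
```

The explanatory variable here is price and the response is units sold; swapping in other explanatory variables or a classification model follows the same fit-then-simulate pattern.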
Those are all excellent examples. Well, I think we could keep talking for a while because we’re both pretty interested and invested in this space, but I’m going to have to wrap it for now. I want to thank you for joining me for this intriguing conversation!
Tom: Most welcome, Sean. I enjoyed it!
FOR IMMEDIATE RELEASE
Authors Sean Campbell and Scott Swigart to Speak at Annual Investigative Reporters and Editors Conference
Portland, Oregon, June 12, 2012 – Campbell and Swigart will join the best in the business in a hands-on workshop and conference hosted by Investigative Reporters & Editors, held in Boston, MA, June 14–17, 2012.
Each year the IRE (Investigative Reporters & Editors) provides investigative journalists with the tools and tips needed to uncover and break the world’s biggest stories. “We are excited and honored to have been invited to participate this year,” says Sean Campbell. “Scott and I feel the tools we use for competitive intelligence have a lot of value for investigative reporters.”
Campbell and Swigart are the acclaimed authors of Going Beyond Google: Gathering Internet Intelligence, now in its third printing. “To be an exceptional journalist you have to dive deeper, much deeper,” says Campbell. “Our contribution to the IRE Conference will be to help investigative reporters and editors find information beyond the cursory Google search and provide some immediate tools to quickly crunch the data once it’s located.”
Mastering these skills will become increasingly important as the competition for breaking stories heats up during the presidential election.
About Campbell and Swigart:
Cascade Insights founders and publishers Sean Campbell and Scott Swigart are recognized throughout the world as CI knowledge leaders. They have lectured at hundreds of industry events, webcasts, and conferences and consult for many Fortune 100 and Fortune 500 companies around the globe. They are currently working on their second book.
Investigative Reporters and Editors, Inc. is a grassroots nonprofit organization dedicated to improving the quality of investigative reporting. IRE was formed in 1975 to create a forum in which journalists throughout the world could help each other by sharing story ideas, newsgathering techniques and news sources.
About the IRE Conference: IRE 2012: Boston
By Sean Campbell
By Scott Swigart