What Competitive Intelligence Can Learn from Investigative Reporters
News reporters use an intriguing mix of tools that map directly onto competitive intelligence work, particularly web mining. In short, investigative reporters have deep expertise in pulling data out of its hiding places and drawing sophisticated conclusions from it. The competitive intelligence field can learn a lot from them through events such as the Computer-Assisted Reporting Conference, which we recently attended.
Scraping Data Rapidly from Web Pages with Scrapely
Screen scraping lets you rapidly gather information from a site and put it into a different form. Tools like Scrapely, an open-source Python scraping library, let users rapidly gather even very large data sets from websites; related point-and-click browser extensions let you simply highlight a chunk of data, select a command from the right-click menu, and export the resulting table to Google Docs.
For more complex page formats, Scrapely can build a parser from a set of examples and apply it automatically to similar pages. Before running Scrapely or any other scraper against a data source, check that source's terms of use.
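Under the hood, screen scrapers turn a page's markup structure into rows of data. Here is a minimal sketch of that idea using only Python's standard-library HTML parser; the table markup and company figures are invented for illustration, and real tools like Scrapely add the template-learning layer on top:

```python
from html.parser import HTMLParser

class TableScraper(HTMLParser):
    """Collects the text of every <td> cell, grouped by <tr> row."""
    def __init__(self):
        super().__init__()
        self.rows, self._row, self._in_cell = [], [], False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td":
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag == "tr" and self._row:
            self.rows.append(self._row)
        elif tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell and data.strip():
            self._row.append(data.strip())

html = """
<table>
  <tr><td>Acme Corp</td><td>$120M</td></tr>
  <tr><td>Globex</td><td>$95M</td></tr>
</table>
"""
scraper = TableScraper()
scraper.feed(html)
print(scraper.rows)  # → [['Acme Corp', '$120M'], ['Globex', '$95M']]
```

From here, the extracted rows can be written out as CSV and loaded into a spreadsheet, which is essentially what the export step in the point-and-click tools automates.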
Automating Information Grouping with Google Refine
In most cases, competitive intelligence practitioners don't suffer from a lack of data; more often, the challenge is making sense of an ocean of it. Clustering algorithms are the clear answer. When you have 1,000 headlines and 20,000 tweets about a competitor, Google Refine (now OpenRefine) can analyze them automatically and group near-identical and closely related text values into clusters.
The time savings of an automated approach like Google Refine's can easily make the difference between merely sampling a body of data and detecting the patterns and underlying significance of an entire data set.
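Google Refine's clustering rests on key-collision methods such as the "fingerprint" keyer: normalize each value, then group values whose normalized keys collide. A simplified re-implementation, with invented sample headlines, shows the mechanic:

```python
import re
from collections import defaultdict

def fingerprint(value):
    """Refine-style key: lowercase, strip punctuation,
    then sort and de-duplicate the remaining tokens."""
    tokens = re.sub(r"[^\w\s]", "", value.lower()).split()
    return " ".join(sorted(set(tokens)))

def cluster(values):
    """Group values whose fingerprints collide; return multi-member groups."""
    groups = defaultdict(list)
    for v in values:
        groups[fingerprint(v)].append(v)
    return [g for g in groups.values() if len(g) > 1]

headlines = [
    "Acme Corp. acquires Globex",
    "ACME CORP ACQUIRES GLOBEX",
    "Globex acquired by Acme Corp",
    "Initech posts record profits",
]
print(cluster(headlines))
# → [['Acme Corp. acquires Globex', 'ACME CORP ACQUIRES GLOBEX']]
```

Note that the third headline stays unclustered because its tokens differ ("acquired" vs. "acquires"); Refine offers additional keyers, such as n-gram and phonetic fingerprints, to catch looser matches like that.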
Visualizing the World of Data with NodeXL, Twiangulate, and Junar
The primary factors driving customers to a specific competitor are often hard to quantify, such as reputation and referrals. To understand that dynamic, you need to know how influence is generated, and by whom. NodeXL graphs the social-network connections between an influencer and other people. Twiangulate shows the mutual followers and friends of two or more Twitter accounts you enter.
Another challenge is that most online data changes constantly, so reports can rapidly become outdated. Junar helps by collecting web data, placing it into dashboards automatically, and updating them dynamically as the sources change. Tools such as these can dramatically improve your ability to tame huge data sets and draw fast, high-quality conclusions.
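The "update dynamically as the source changes" behavior can be approximated with simple change detection: fingerprint each fetch of the source and re-render only when the fingerprint differs. A toy sketch, where the Dashboard class and data strings are invented for illustration:

```python
import hashlib

def snapshot(data: str) -> str:
    """Fingerprint the current state of a data source."""
    return hashlib.sha256(data.encode()).hexdigest()

class Dashboard:
    """Re-renders only when the underlying data actually changes."""
    def __init__(self):
        self._last = None
        self.refreshes = 0

    def update(self, data: str):
        digest = snapshot(data)
        if digest != self._last:
            self._last = digest
            self.refreshes += 1  # in a real tool, re-render charts here

dash = Dashboard()
dash.update("revenue: 100")
dash.update("revenue: 100")  # unchanged, so no refresh
dash.update("revenue: 120")  # changed, so refresh
print(dash.refreshes)  # → 2
```

Run on a schedule against a live URL, this keeps a report current without redrawing it on every poll.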