Web Intelligence Reputation

There is no doubt about the benefits provided by the Web, particularly for companies: it allows the dissemination of commercial information so that products and services can be advertised or sold. Moreover, because the Internet gives a reliable picture of reality, any business can analyse the information published online to assess how it is perceived by the market. This can provide a strong competitive advantage: companies can use online research to enhance their image in terms of marketing or simply to improve the quality and features of their products.

A New Method of Data Analysis

A computer equipped with artificial intelligence can perform functions and operations similar to those of the human mind. To analyse texts, however, it must first be given the capacity for study and interpretation that individuals develop during their school years.

The data comprehension process involves analysis of a text on four different levels:

  1. grammatical analysis: this allows a grammatical sense (verb, adjective, noun, article...) to be given to each segment of the text, thereby removing lexical ambiguity;
  2. logical analysis: this recognises the role that groups of words play within the text and answers questions such as where? how? when? and who?;
  3. semantic analysis: this allows a meaning to be assigned to the right syntactic structure and, consequently, to the linguistic expression, eliminating semantic ambiguities;
  4. sentiment analysis: this allows the polarity of the content regarding an individual, a product or a brand to be determined (positive, neutral, negative).
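The fourth level can be illustrated with a minimal sketch. It assumes a tiny hand-made word list (`POSITIVE`, `NEGATIVE`, `NEGATORS`) and a one-word negation rule, all of which are hypothetical; the approach described here relies on grammatical, logical and semantic analysis rather than on a fixed lexicon, so this only shows what a polarity label looks like in practice.

```python
import re

# Hypothetical toy lexicon, for illustration only.
POSITIVE = {"good", "great", "excellent", "reliable", "love"}
NEGATIVE = {"bad", "poor", "terrible", "broken", "hate"}
NEGATORS = {"not", "never", "no"}


def polarity(text: str) -> str:
    """Return 'positive', 'neutral' or 'negative' for a short text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    for i, tok in enumerate(tokens):
        # +1 for a positive word, -1 for a negative word, 0 otherwise.
        value = (tok in POSITIVE) - (tok in NEGATIVE)
        # A negator directly before the word flips its polarity.
        if value and i > 0 and tokens[i - 1] in NEGATORS:
            value = -value
        score += value
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

For example, `polarity("The product is not good")` is labelled negative because the negator flips the positive word that follows it.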

Clustering techniques are then used to classify various types of comments into groups (e.g. complaints or suggestions), thus creating new keys for interpretation of the data.
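As a rough illustration of grouping comments, the sketch below uses single-pass ("leader") clustering with Jaccard similarity over word sets. The technique, the stop-word list, the threshold value and the sample comments are all assumptions made for the example; the text does not say which clustering method is actually used.

```python
import re

# Hypothetical stop-word list, kept short for the example.
STOPWORDS = {"the", "a", "an", "is", "it", "to", "and", "of", "was", "please"}


def tokens(text):
    """Lower-cased content words of a comment, as a set."""
    return {t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS}


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def leader_cluster(comments, threshold=0.2):
    """Assign each comment to the most similar existing group, or start a new one."""
    clusters = []  # each entry: {"terms": set of words, "members": list of comments}
    for comment in comments:
        toks = tokens(comment)
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = jaccard(toks, cluster["terms"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            best["members"].append(comment)
            best["terms"] |= toks
        else:
            clusters.append({"terms": set(toks), "members": [comment]})
    return [cluster["members"] for cluster in clusters]


comments = [
    "Delivery was late and the box was damaged",
    "Late delivery, damaged box again",
    "Please add a dark mode option",
    "A dark mode would be a nice option",
]
groups = leader_cluster(comments)
```

With these sample comments the two delivery complaints end up in one group and the two dark-mode suggestions in another. The result of leader clustering depends on input order and on the threshold, which is one reason production systems often prefer other methods.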

Knowledge Mining: A New Methodological Approach

This new approach to the interpretation of data consists of two phases:

  1. a mining phase: examining relevant texts as if they were a mine to be explored;
  2. a knowledge phase: identifying the information of real importance and any connections that were initially hidden.

The approach involves the use of a crawler, that is, software that browses the content of the network methodically and automatically and places it in an index. It then analyses all the data collected and subdivides it according to relevance and importance in order to understand its meaning. The importance of one item of information compared to another is not determined by the presence of certain keywords: everything depends on the contextualisation of the information and its automatic comprehension.
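The crawling and indexing step alone can be sketched as a breadth-first traversal that fills an inverted index. The `PAGES` dictionary is a hypothetical in-memory stand-in for the Web, and the `fetch` parameter is an assumed interface; a real crawler would download pages over HTTP and respect site policies. The contextual comprehension described above is not covered here: this index maps plain terms to pages and nothing more.

```python
import re
from collections import defaultdict, deque

# Hypothetical stand-in for the Web: URL -> (page text, outgoing links).
PAGES = {
    "site/home": ("Acme launches new phone", ["site/review", "site/forum"]),
    "site/review": ("Acme phone review: battery is excellent", ["site/home"]),
    "site/forum": ("Forum thread about battery complaints",
                   ["site/review", "site/missing"]),
}


def crawl(seed, fetch):
    """Visit pages breadth-first from `seed` and build term -> set of URLs."""
    index = defaultdict(set)
    seen = {seed}
    queue = deque([seed])
    while queue:
        url = queue.popleft()
        page = fetch(url)
        if page is None:  # unreachable page: skip it
            continue
        text, links = page
        for term in re.findall(r"[a-z]+", text.lower()):
            index[term].add(url)
        for link in links:
            if link not in seen:  # visit each URL at most once
                seen.add(link)
                queue.append(link)
    return index


index = crawl("site/home", PAGES.get)
```

Looking up `index["battery"]` then returns the review and forum pages, which later stages could analyse for relevance and meaning.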

The knowledge mining process, which allows online data to be found and interpreted in terms of quality, quantity and reputational sentiment, can be summarised as follows:

  • Study of the context, in order to select data on the Web in line with the object of the search;
  • Exploring the web with a crawler: study of the content, separation and classification of what is relevant;
  • Interpretation of content in terms of quantity and quality;
  • Decoding the polarisation: evaluation of data collected in terms of quality, through the recognition of expected and unexpected results.
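The last two steps can be pictured with a small aggregation sketch. It assumes each collected item has already been reduced to a record with hypothetical `source` and `polarity` fields, and it treats "unexpected" as any item whose polarity is neither neutral nor the polarity the analyst expected; that reading is an assumption made for the example, not a definition taken from the process itself.

```python
from collections import Counter

POLARITIES = ("positive", "neutral", "negative")


def summarise(mentions, expected="positive"):
    """Count mentions per source and polarity, and list unexpected ones."""
    by_polarity = Counter(m["polarity"] for m in mentions)
    by_source = Counter(m["source"] for m in mentions)
    total = len(mentions)
    # Share of each polarity; empty when nothing was collected.
    shares = {p: by_polarity[p] / total for p in POLARITIES} if total else {}
    unexpected = [m for m in mentions
                  if m["polarity"] not in (expected, "neutral")]
    return {
        "total": total,
        "by_source": dict(by_source),
        "shares": shares,
        "unexpected": unexpected,
    }


# Hypothetical sample records.
mentions = [
    {"source": "blog", "polarity": "positive"},
    {"source": "forum", "polarity": "negative"},
    {"source": "news", "polarity": "neutral"},
    {"source": "blog", "polarity": "positive"},
]
report = summarise(mentions)
```

Here the quantity side is the count per source and the quality side is the polarity distribution, with the single negative forum post surfaced as the unexpected result.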

Notes on Confidentiality of Information

All information collected in this report, as a result of searching posts in blogs, forums, social networks or news stories, has come from the public domain and, as such, is accessible to anyone.

Report SMALL

Sources consulted:
  • open sources (Internet, major search engines, social networks).

Output:
  • Negative: "No information of interest regarding the subject was found."
  • Positive: a short collection of the evidence found in graphical format with links to the source.
Completion Time: 3-5 days

Report MEDIUM

Sources consulted:
  • open sources (Internet, major search engines, social networks);
  • press records from over 4,000 national and local newspapers published in a period of up to ten years (e.g. 2004-2014).
Output:
  • Negative: "No information of interest regarding the subject was found."
  • Positive:
    • a collection of the evidence found in graphical format with links to the source;
    • a copy of the article(s) and details of the publication.
Completion Time: 5-7 days

Report LARGE

Sources consulted:
  • open sources (Internet, major search engines, social networks);
  • press records from over 4,000 national and local newspapers published in a period of up to ten years;
  • detrimental factors of a confidential nature from intelligence activities (combined with journalistic interviews, if available).
Output:
  • Negative: "No information of interest regarding the subject was found."
  • Positive:
    • a collection of the evidence found in graphical format with links to the source;
    • a copy of the article(s) and details of the publication;
    • clear indications of the types of detrimental factors that have emerged.
Completion Time: 8-12 days