
Unveiling The Structure Of Information: The Fundamental Scale

Information has been a matter of intense study and discussion for the last century. Its very nature, neither a physical object nor an entirely abstract entity, has fostered an ongoing debate whose first episodes date back to the 1940s, with the arguments between Wiener and Shannon.

Whether information is the effect produced by a received pattern of signals or the pattern of signals itself, these patterns exist, and our work is devoted to delving into them to find a model for their structure and to establish useful comparisons among them.

The Fundamental Scale

We define the Scale of a description as the number of different symbols used in the description. The scale is therefore closely related to the written symbols (or the symbols used within the pattern of signals) that the observer is able to identify. For example, a reader interpreting an English text may read it letter by letter or word by word. In these cases, the text is interpreted at the scale of characters and the scale of words, respectively. This definition offers important advantages in our studies of information.
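As a minimal illustration of this definition, the following Python sketch counts the distinct symbols of a short sentence read letter by letter and word by word. The function name `scale` and the sample sentence are chosen here for illustration only; they do not come from the published studies.

```python
def scale(symbols):
    """Return the number of different symbols in a sequence of symbols."""
    return len(set(symbols))

text = "the cat sat on the mat"

# Reading letter by letter: every character, including the space, is a symbol.
char_scale = scale(text)

# Reading word by word: every whitespace-separated word is a symbol.
word_scale = scale(text.split())

print(char_scale, word_scale)
```

The same text thus has a different scale depending on which symbols the observer chooses to identify.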

One specific scale maximizes the amount of information extracted from a given description. It results from a unique way of reading the description, using the particular set of symbols that minimizes the computed symbolic entropy. The number of reading combinations to consider in order to select the one that minimizes the entropy is astronomical, but a genetic algorithm to find this set of symbols was developed and published in 2015 by Febres and Jaffe. The Fundamental Scale is the resulting set of symbols, and its quantitative value (the number of different symbols) plays an important role as the base of the logarithm in Shannon’s famous entropy equation.
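To make the role of the logarithm base concrete, here is a small Python sketch of Shannon’s entropy computed with the number of different symbols as the base, so that the result lies between 0 and 1. The function name `normalized_entropy` and the convention of returning 0 when fewer than two different symbols are present are assumptions made for this sketch.

```python
import math
from collections import Counter

def normalized_entropy(symbols):
    """Shannon entropy of a symbol sequence, with log base = number of different symbols."""
    counts = Counter(symbols)
    diversity = len(counts)          # number of different symbols (the scale)
    total = sum(counts.values())     # total number of symbols read
    if diversity < 2:
        return 0.0                   # assumed convention: a single symbol carries no uncertainty
    return -sum((c / total) * math.log(c / total, diversity)
                for c in counts.values())

print(normalized_entropy("abab"))    # two equally frequent symbols
print(normalized_entropy("aaab"))    # skewed frequencies give a lower value
```

With this base, equally frequent symbols give the maximum value of 1, and any imbalance in the frequencies lowers the entropy.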

Comparing Literary Texts And Music

In recent years, the information conveyed through languages of different nature has been compared. To make computer processing of such different signals possible, the comparison is performed using the written version of each language. Thus, for speeches delivered in natural languages such as English or Spanish, the information is handled by using the texts representing the speeches. Music and sounds are studied by ‘reading’ the text associated with the file produced when the sound sequence was recorded. Of course, interpreting these apparently meaningless texts, especially for music and sounds, is a painstaking task. Yet the Fundamental Scale Algorithm allows us to unveil some of the structure and behavior of different languages.

By considering combinations of neighboring characters forming symbols of diverse length, the Fundamental Scale Algorithm ‘scans’ practically all ways of interpreting a text written in one dimension. The algorithm makes no assumption about the rules governing the presumably unknown language used to write the text. When the algorithm finishes, it returns the set of symbols (character sequences) that produces the minimal entropy, thus ‘squeezing’ the text to extract the maximum possible amount of information.
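The idea of scanning alternative readings can be sketched with a toy brute-force search. This is not the published Fundamental Scale Algorithm: it simply enumerates every way of cutting a very short text into pieces of bounded length and keeps the reading with the lowest entropy. The names `segmentations` and `best_reading`, the `max_len` bound, and the simplified entropy objective are all assumptions of this sketch, and exhaustive enumeration is only feasible for tiny inputs.

```python
import math
from collections import Counter

def entropy(symbols):
    """Shannon entropy with log base = number of different symbols (0 if fewer than two)."""
    counts = Counter(symbols)
    diversity, total = len(counts), len(symbols)
    if diversity < 2:
        return 0.0
    return -sum((c / total) * math.log(c / total, diversity)
                for c in counts.values())

def segmentations(text, max_len):
    """Yield every way of cutting text into consecutive pieces of 1..max_len characters."""
    if not text:
        yield []
        return
    for k in range(1, min(max_len, len(text)) + 1):
        for rest in segmentations(text[k:], max_len):
            yield [text[:k]] + rest

def best_reading(text, max_len):
    """Return the segmentation with minimal entropy (toy exhaustive search)."""
    return min(segmentations(text, max_len), key=entropy)

print(best_reading("abababab", 2))
```

For the repetitive string above, the reading that groups the characters into the recurring pair has the lowest entropy, without any prior knowledge of the ‘language’ being supplied. For texts of realistic length the number of readings grows far too quickly for enumeration, which is why a heuristic search is needed.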

Other observation scales, for example the scale of words, function well to transmit semantic information, whereas the Fundamental Scale reveals the structure that a language has developed mostly to deliver symbolic information. The Fundamental Scale allows us to gain a better understanding of the capacities and behavior of languages of different nature. When comparing natural languages with music, we have found a wider entropy distribution for music, perhaps a result of the more relaxed grammar of music as compared to natural languages. However, the diversity of symbols exhibited by natural languages is greater than that of music.

The Fundamental Scale has also proven useful for classification purposes. By ordering the fundamental symbols by their frequency of appearance, so-called symbol profiles can be plotted. These profiles, similar to those obtained when the word frequencies of an English text are plotted to illustrate Zipf’s Law, adopt a form that is unique to the language used to make a description. Different descriptions expressed in the same language also produce different profile shapes. The profiles built with the Fundamental Symbols thus serve as a classification and identification tool.
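A symbol profile of this kind can be sketched in a few lines of Python. The example below ranks symbols by relative frequency; it uses whitespace-separated words as symbols purely for convenience, whereas the profiles described here are built from the fundamental symbols. The function name `symbol_profile` and the alphabetical tie-breaking rule are assumptions of this sketch.

```python
from collections import Counter

def symbol_profile(symbols):
    """Return (rank, symbol, relative frequency) tuples, most frequent symbol first."""
    counts = Counter(symbols)
    total = sum(counts.values())
    # Sort by decreasing count; break ties alphabetically so the order is reproducible.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(rank, symbol, count / total)
            for rank, (symbol, count) in enumerate(ranked, start=1)]

words = "the cat and the dog and the bird".split()
for rank, symbol, freq in symbol_profile(words):
    print(rank, symbol, freq)
```

Plotting frequency against rank, typically on logarithmic axes, gives the profile shape that can then be compared across descriptions.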

Spaces To Model The Representation Of Descriptions

Two types of space for representing complex descriptions have been explored. In the first type, the axes measure the ‘concentration’ of information; normalized quantities such as entropy and symbol diversity are the basis of these spaces. The second type serves to qualify how information extends and organizes itself to constitute a description; this space is built with axes such as resolution, scope, and scale. In some of our current studies, we explore possibilities for representing complex organisms by means of fractal dimensions. We thus expect to show, in a two- or three-dimensional structure, the whole multiscale shape of the complex entity or organism.

This article is based on a compilation of research by Gerardo Febres and Klaus Jaffe at the Universidad Simón Bolívar, Venezuela, and Carlos Gershenson at the Universidad Nacional Autónoma de México. Two recent studies on the topic, Calculating entropy at different scales among diverse communication systems and Music viewed by its entropy content, were published in the journals Complexity and PLOS ONE, respectively.