Alceste
Alceste is a software package
for textual data analysis, developed by the C.N.R.S. with the
support of ANVAR. It is a tool for the automatic
analysis of textual data (open-ended questions, literary works, magazine
articles, essays, etc.).
Alceste can be used in sociology, psychology,
survey processing, speech analysis, marketing consultancy,
advertising, journalism, history, law, linguistics, medicine,
documentary research, press analysis, and more generally in any field where
large amounts of text have to be processed.
The aim is to quantify a text in order to extract
its most significant structures and so draw out the essential information
contained in the textual data. Research (J.P. Benzecri,
M. Reinert) has shown that these structures are closely linked to the distribution
of the words in a text, and that this distribution is rarely
random. The purpose of Alceste is to describe, classify, assimilate, and
synthesize a text automatically.
Method used
Descending Hierarchical Classification
(DHC) is the method used by Alceste. This method carries out successive
splits of the text. At each step it finds the strongest vocabulary oppositions
in the text, and then extracts categories of representative
terms. This methodology does not require any a priori knowledge about
the text to be analysed.
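To make the idea of a single descending split concrete, here is a minimal sketch in Python. It is a toy illustration under stated assumptions, not Alceste's own procedure: it tries every two-way partition of a handful of text segments by brute force and keeps the one whose aggregated word counts are most strongly opposed, using a chi-square statistic as the measure. The function names (`chi_square`, `best_split`) and the sample segments are hypothetical.

```python
from itertools import combinations


def chi_square(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat


def best_split(segments):
    """Return (chi2, group_a, group_b) for the two-way split of `segments`
    whose word-count profiles are most strongly opposed.

    Assumption: exhaustive search, which is only feasible for a few segments.
    """
    vocab = sorted({word for seg in segments for word in seg})
    indices = range(len(segments))
    best = None
    for size in range(1, len(segments) // 2 + 1):
        for group_a in combinations(indices, size):
            group_b = [i for i in indices if i not in group_a]
            # One row of word counts per group, one column per word.
            table = [
                [sum(segments[i].count(word) for i in group) for word in vocab]
                for group in (group_a, group_b)
            ]
            stat = chi_square(table)
            if best is None or stat > best[0]:
                best = (stat, list(group_a), group_b)
    return best


# Hypothetical sample: two segments about animals, two about finance.
segments = [
    "the cat chased the mouse".split(),
    "the mouse feared the cat".split(),
    "the bank raised the rate".split(),
    "the rate worried the bank".split(),
]
chi2, group_a, group_b = best_split(segments)
print(group_a, group_b)
```

Applying `best_split` again to each resulting group would give the successive splits of a descending hierarchy. Note that the word "the", which is spread evenly over both groups, contributes nothing to the statistic.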
Characteristics
Alceste analyses all types of texts, whether
captured with a word processor, a scanner, or speech recognition.
It runs on Windows 95, Windows 98, Windows NT 4, Power Macintosh, and UNIX operating
systems.
Its ergonomic, user-friendly graphical interface with a new
look, together with its robust, high-performance core
functions, gives the software a solid foundation and makes it
a relevant aid for analysing and interpreting textual data.
Alceste has a panoramic screen that
sums up the main results and gives a
general view, so that the user can scan, compare, select, edit, zoom in on, and export the various
results when writing a final report.
A graphical Factor Analysis of Correspondences (FAC) module makes it
possible to visualize, bring out, refine, confirm, and interpret the main
results.
Thanks to its dictionaries (French, English,
Spanish, Portuguese, Italian), Alceste meets the needs of
users of textual data analysis software who want to process corpora
in different languages. These dictionaries are supplied with the software
and can be customized.
Functions
Vocabulary analysis
This is the first step of the processing, in which
the following operations are carried out:
- counting the words,
- counting the vocabulary roots, after reduction,
- creating the dictionaries,
- recognising the words according to their
grammatical category.
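The operations above can be sketched roughly as follows. This is only an illustration under loud assumptions: the suffix list, the function-word list, and the names `tokenize`, `reduce_form`, and `analyse_vocabulary` are all hypothetical, and a naive suffix stripper stands in for the dictionary-based reduction that the software performs.

```python
import re
from collections import Counter

# Hypothetical suffix list, for illustration only; a real reduction step
# would rely on a language dictionary.
SUFFIXES = ("ing", "ed", "es", "s")

# Hypothetical list of function words, a crude stand-in for recognising
# grammatical categories.
FUNCTION_WORDS = {"the", "a", "an", "of", "and", "in", "to"}


def tokenize(text):
    """Lower-case the text and return its alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def reduce_form(word):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def analyse_vocabulary(text):
    """Return (word counts, counts of reduced forms of content words)."""
    words = tokenize(text)
    word_counts = Counter(words)
    content_words = [w for w in words if w not in FUNCTION_WORDS]
    root_counts = Counter(reduce_form(w) for w in content_words)
    return word_counts, root_counts


word_counts, root_counts = analyse_vocabulary(
    "The analyst analysed the texts and the text was analysed."
)
print(word_counts.most_common(3))
print(root_counts.most_common(3))
```

Here "texts" and "text" are counted under the same reduced form, which is the point of the reduction step. The crude stripper also leaves "analyst" and "analysed" as separate forms, a reminder that a dictionary is needed for reliable results.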
Standard Analysis
Alceste performs a standard analysis,
a "typical" and relevant analysis made up of two
classifications (double classification), so that the results are not influenced
by the way the text was split and remain stable.
After analysing the vocabulary and splitting
the text, Alceste moves on to the classification phase, which finds
the strongest oppositions between the words and extracts classes of terms.
For each class, this analysis gives the following main
results:
- the most significant words and
sentences (using chi-square values to measure the strength of the links),
- the recurrent segments,
- the concordances of the most characteristic words.
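As an illustration of how a chi-square value can measure the link between a word and a class, the sketch below computes the statistic for a 2x2 table crossing "segment contains the word" with "segment belongs to the class". It is a minimal example, assuming presence/absence coding per segment and no continuity correction; the source does not specify the exact formula the software uses. The function name `word_class_chi2` and the sample data are hypothetical.

```python
def word_class_chi2(segments, labels, word, target_class):
    """Chi-square (1 degree of freedom, no continuity correction) for the
    2x2 table: word present/absent x segment inside/outside the class."""
    a = b = c = d = 0
    for segment, label in zip(segments, labels):
        has_word = word in segment
        in_class = label == target_class
        if has_word and in_class:
            a += 1  # word present, segment in class
        elif has_word:
            b += 1  # word present, segment outside class
        elif in_class:
            c += 1  # word absent, segment in class
        else:
            d += 1  # word absent, segment outside class
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0  # word (or class) covers all or none of the segments
    return n * (a * d - b * c) ** 2 / denominator


# Hypothetical sample: each segment is a set of words with a class label.
segments = [
    {"market", "price", "rise"},
    {"market", "bank", "rate"},
    {"price", "market", "fall"},
    {"team", "match", "goal"},
    {"goal", "player", "team"},
    {"match", "market", "ticket"},
]
labels = [1, 1, 1, 2, 2, 2]

chi2_market = word_class_chi2(segments, labels, "market", 1)
chi2_rise = word_class_chi2(segments, labels, "rise", 1)
print(chi2_market, chi2_rise)
```

The statistic itself is unsigned: a word that is typically absent from a class scores as high as one that is typically present, so the direction of the link (the sign of `a * d - b * c`) has to be checked separately when ranking words.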
Cross-data analysis
A cross-data analysis with Alceste makes it possible to cross
a form (word) or a variable with the whole text.
In the case of a form, the cross-data
analysis divides the corpus into two parts ("classes"): one
containing the form
and one that does not contain it.
In the case of a variable, for example X, with the modalities
X1, X2, X3, ..., Alceste divides the corpus
according to the modalities of the variable. As a consequence,
there are as many "classes" as variable modalities.
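The two kinds of cross-data partition described above can be sketched as follows. This is a minimal illustration, assuming each text unit is stored as a dictionary holding its text and the value of a variable X; this data layout and the function names `cross_by_form` and `cross_by_variable` are hypothetical and are not the software's input format.

```python
from collections import defaultdict


def cross_by_form(units, form):
    """Split the units into two classes: with and without the given form."""
    classes = {"with": [], "without": []}
    for unit in units:
        words = unit["text"].lower().split()
        key = "with" if form in words else "without"
        classes[key].append(unit)
    return classes


def cross_by_variable(units, variable):
    """Split the units into one class per modality of the given variable."""
    classes = defaultdict(list)
    for unit in units:
        classes[unit[variable]].append(unit)
    return dict(classes)


# Hypothetical corpus: four text units tagged with a variable X.
units = [
    {"text": "prices rose again this year", "X": "X1"},
    {"text": "the team won the match", "X": "X2"},
    {"text": "ticket prices worry the fans", "X": "X2"},
    {"text": "the bank cut its rate", "X": "X3"},
]

by_form = cross_by_form(units, "prices")
by_variable = cross_by_variable(units, "X")
print(len(by_form["with"]), len(by_form["without"]))
print({modality: len(members) for modality, members in by_variable.items()})
```

Once the classes are fixed in this way, the vocabulary of each class can be compared with the rest of the corpus, for example with a chi-square measure.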
Examples of analysis
Qualitative surveys: analysis of interviews,
open-ended questions, magazine articles, TV debates during presidential
elections, etc.
In socio-psychology: analysis of
semi-directive interviews, interviews, children's stories, dreams, nightmares,
etc.
In the sciences: analysis of technical texts, reports, accounts,
medical diagnoses, system breakdowns and messages in computer science,
etc.
In literature: analysis and synthesis of books, poetry, plays,
philosophical texts, etc.
Multilingual texts: analysis of texts in French, English,
Catalan, Spanish, Gascon, Italian, Portuguese, German, Russian, etc.