An excerpt from the doctoral (PhD) degree ceremony held on 10 June 2016 at Politecnico di Milano. The PhD was a unique and incredible journey that gave me the chance to live some fantastic experiences. There are a great many people I have to thank, all gratefully mentioned in the thesis, but my biggest thanks go to my professor, Prof. Licia Sbattella, for believing in me.

Below is the abstract of my thesis, carried out at the Department of Electronics, Information and Bioengineering of Politecnico di Milano, in the field of artificial intelligence (NLP) applied to the forensic context.

ABSTRACT:

This thesis presents DIKE (Description of Interrogations by Knowledge Extraction), a tool that analyzes examinations in a court of law by extracting relevant information. The work aims at the design and implementation of a conceptual model able to describe this information and useful for understanding how forensic examinations develop. DIKE is based on an original multi-dimensional conceptual model, which represents several aspects of examinations according to psychological, juridical, and linguistic theories. To implement this model, we created a new annotated audio/textual corpus, built from real examination recordings and transcriptions from Italian trials and annotated with sentence- and utterance-level labels. This corpus makes it possible to annotate new examinations automatically by means of an original multi-level, HMM-based model. Audio and transcriptions are automatically aligned by an original algorithm designed for low-quality, noisy audio recordings. The multi-level, HMM-based model combines speech and textual features to classify the examination along several dimensions. DIKE highlights dialogue sequences in which the speakers experienced crises as a consequence of deliberately provoked or unwanted stressful events. As a didactic tool, DIKE helps improve examination techniques by generating, in a novel way, a rich description of each examination, which is used to define a profile for the examination and a profile for each participant in the examination dialogue.
Summing up, the main contributions of this thesis are: the definition of a new multi-dimensional conceptual model representing examinations from different points of view; a new annotated audio/textual corpus composed of real forensic examinations; an alignment algorithm tailored to noisy environments; a specific multi-level, HMM-based model; and a new educational tool for calculating, visualizing, and analyzing speaker and dialogue profiles.