Company Name: Université PSL | Location: Paris | Type: Job | Last Date to Apply: 2021-02-28
“Artificial intelligence for the Sciences” (AI4theSciences) is an innovative, interdisciplinary and intersectoral PhD programme, led by Université Paris Sciences et Lettres (PSL) and co-funded by the European Commission. Supported by the European innovation and research programme Horizon 2020-Marie Skłodowska-Curie Actions, AI4theSciences is uniquely shaped to train a new generation of researchers who work at the highest academic level in their main discipline (Physics, Engineering, Biology, Human and Social Sciences) and master the latest technologies in Artificial Intelligence and Machine Learning that apply to their own field. Twenty-six doctoral students will join the doctoral schools of Université PSL in two academic cohorts to carry out work on subjects suggested and defined by PSL’s scientific community. The 2020 call will offer up to 15 PhD positions on 24 PhD research projects. The candidates will be recruited through high-standard HR processes based on transparency, equal opportunities and excellence.

Description of the PhD subject: “Language Acquisition in Brains and Algorithms: towards a systematic tracking of the evolution of semantic representations in biological and artificial neural networks”

Context – Motivation

Challenge. In less than 400,000 years, our species learnt to master fire, build tools and disperse across the planet. This sudden behavioral shift likely results from a specific cognitive breakthrough: the human brain, unlike that of other species, is uniquely developed to learn and share information across individuals and generations and, thus, to contribute to and benefit from an incremental development of cultural and technological artefacts. This is at least the hypothesis put forward by influential linguists (Chomsky, 2014), artificial intelligence (AI) researchers (Turing, 2009) and cognitive neuroscientists (Dehaene, 2020).
Specifically, it is hypothesized that our species evolved a unique ability to perform semantic composition, that is, to combine a limited set of known elements to form a novel and meaningful representation. Speech, in particular, depends on our ability to recursively combine successive words into a complex meaning. Although the order of these putative operations (“syntax”) has been under extensive scrutiny in neuroscience (Pallier, Devauchelle and Dehaene, 2011; Ding et al., 2015), how the human brain learns to perform “semantic composition” remains largely unknown.

Breakthrough. Three methodological advances may help us address this historical challenge. First, the rapid progress in deep learning and natural language processing (NLP) now provides concrete and testable models of language processing and acquisition (Caucheteux and King, 2020). Specifically, deep neural networks trained to predict words from a given context (e.g. the preceding sentence) excel in automatic translation and achieve remarkable results in summarization, question answering and dialogue (Devlin et al., 2018).

Second, intracranial recording devices and analyses have improved considerably over the last decade. It is now possible to track and decode speech processing with remarkable precision (Chang et al., 2011). In particular, using such intracranial devices, Nelson and Dehaene (Nelson et al., 2017) have described the neurophysiological dynamics of phrase-structure building during sentence processing. Our partner institution, the Hospital Fondation Adolphe de Rothschild, has expertise in the treatment of drug-resistant epilepsy in children, a condition that requires long-lasting intracranial recording of brain activity.
This partner offers a unique opportunity to track language processing throughout the course of brain development, with a spatio-temporal resolution that largely exceeds that of previous investigations (Dehaene-Lambertz, Dehaene and Hertz-Pannier, 2002).

Third, several lines of research have demonstrated that deep learning algorithms generate representations that can be mapped directly onto brain activity. For example, James DiCarlo and his team have shown that infero-temporal spiking activity linearly correlates with the activations of deep convolutional neural networks trained to recognize objects from natural images (Yamins et al., 2014). Similarly, we have recently shown (Caucheteux and King, 2020) that human brain activity linearly correlates with the activations of deep transformer networks trained on language modeling. Together, these studies show that AI algorithms find brain-like solutions, and could thus (1) help us understand the brain's computational organization and (2) benefit from neuroscientific studies (Richards et al., 2019).

Scientific objectives, methodology & expected results

Objective. Equipped with testable models of brain responses to language processing, we will, for the first time, use intracranial recordings to investigate how the representation of speech evolves throughout human development.

Approach.

Brain recordings. The Hospital Fondation Adolphe de Rothschild hosts a highly specialized unit dedicated to recording and treating drug-resistant epilepsy in children aged 2 to 20 years. As in adult patients, localizing the seizure onset zone is crucial in order to remove it during a neurosurgical procedure. Identifying this zone often requires a week-long iEEG recording (implantation of depth intracerebral electrodes). Sometimes this recording has to be repeated, offering a unique opportunity to record brain activity directly at different periods of its development.

Modeling.
Children will listen to pre-recorded sentences and narratives, such as The Little Prince by Antoine de Saint-Exupéry, while being recorded with iEEG. The auditory stimulus will be fed to speech and language algorithms such as Wav2Letter and BERT (Devlin et al., 2018). Finally, a linear mapping will be learnt and assessed via cross-validation to test whether the activations of the deep neural network systematically correlate with brain responses. The systematic quantification of this model-to-brain mapping across (1) children’s age, (2) brain regions, (3) corpora and (4) deep learning architectures will allow us to test whether deep neural networks learn to represent language similarly to humans.

The analysis of such large and complex data will require a high level of expertise and will be performed in the Département d’Etudes Cognitives (DEC) at the Ecole Normale Supérieure (ENS – PSL), a member of the Paris Sciences et Lettres (PSL) University. Artificial intelligence concepts have been applied over the past few years to both human and animal neurophysiological datasets (including intracranial signals) using innovative approaches in this field. In particular, functional connectivity analysis based on Information Theory concepts and hierarchical analysis of neural networks through unsupervised learning could reveal previously undetected patterns and help to understand the neural code (Bourdillon et al., 2020). Consequently, this interdisciplinary and intersectoral project will be co-supervised by both an academic institution (PSL) and a non-academic one (Hospital Fondation Adolphe de Rothschild).
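As an illustration of the model-to-brain mapping described under Modeling, the following is a minimal sketch and not the project's actual pipeline. It assumes a ridge-regularized linear mapping, contiguous cross-validation folds and a Pearson correlation score per recording channel; none of these choices is specified in the project description. The function names (`ridge_fit`, `pearson_per_column`, `brain_score`) are hypothetical, and the data are synthetic stand-ins for network activations and iEEG responses.

```python
import numpy as np


def ridge_fit(X, Y, alpha):
    """Closed-form ridge regression: W = (X'X + alpha * I)^-1 X'Y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)


def pearson_per_column(A, B):
    """Pearson correlation between matching columns of A and B."""
    A = A - A.mean(axis=0)
    B = B - B.mean(axis=0)
    denom = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=0)
    return (A * B).sum(axis=0) / denom


def brain_score(activations, recordings, alpha=1.0, n_folds=5):
    """Cross-validated correlation between predicted and recorded signals.

    activations: (n_samples, n_features) model activations (assumed input)
    recordings:  (n_samples, n_channels) brain responses (assumed input)
    Returns one score per channel, averaged over folds.
    """
    n_samples = activations.shape[0]
    # Contiguous folds: an assumption intended to limit leakage between
    # neighbouring time samples.
    folds = np.array_split(np.arange(n_samples), n_folds)
    scores = []
    for test_idx in folds:
        train_mask = np.ones(n_samples, dtype=bool)
        train_mask[test_idx] = False
        X_train, Y_train = activations[train_mask], recordings[train_mask]
        X_test, Y_test = activations[test_idx], recordings[test_idx]
        # Normalization statistics come from the training split only.
        mu_x = X_train.mean(axis=0)
        sd_x = X_train.std(axis=0) + 1e-12
        mu_y = Y_train.mean(axis=0)
        W = ridge_fit((X_train - mu_x) / sd_x, Y_train - mu_y, alpha)
        Y_pred = ((X_test - mu_x) / sd_x) @ W + mu_y
        scores.append(pearson_per_column(Y_pred, Y_test))
    return np.mean(scores, axis=0)


# Synthetic example: the first three channels depend linearly on the
# activations, the last channel is pure noise.
rng = np.random.default_rng(0)
n_samples, n_features, n_channels = 500, 20, 4
activations = rng.standard_normal((n_samples, n_features))
true_weights = rng.standard_normal((n_features, n_channels))
true_weights[:, -1] = 0.0
noise = rng.standard_normal((n_samples, n_channels))
recordings = activations @ true_weights + noise

scores = brain_score(activations, recordings)
print(scores)
```

In this toy setting the channels that depend on the activations receive high scores and the noise channel scores near zero. In a study of the kind described, the same score would be compared across ages, brain regions, corpora and architectures.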
The academic development of new artificial intelligence analyses will thus find a concrete application in the data-science department of the hospital, and PSL will gain a unique opportunity to team up with one of the only teams recording intracranial brain activity in children.

Following standard practices, the experiment will be submitted to the ethics committee before the PhD program begins in order to optimize its feasibility. As nothing is modified apart from the possibility for the children to listen to a story during their hospital stay, the ethical issues will be limited. Consent from the patients and their parents will be essential prior to enrolment.

The first year of the PhD program will be dedicated to the design of controlled experimental stimuli optimized to reveal the evolution of compositional representations in the human brain. In parallel, the PhD candidate will develop signal-processing skills on existing intracranial recordings.

The acquisition of new data will be possible during the second year of the PhD program (25 recordings are planned for inclusion among the 40 performed every year) and will be only moderately time-consuming, since data acquisition can possibly be automated. The PhD candidate will therefore have time to develop his or her skills in artificial intelligence.

The last year could be fully dedicated to data analysis, collaboration with other partners (such as Facebook Artificial Intelligence Research) and scientific writing. It is worth noting that iEEG recordings are already routinely performed at the Hospital Fondation Adolphe de Rothschild and that the PSL lab already has extensive experience in designing linguistic experiments and in data science, which makes it realistic to complete this project in three years.

International mobility

Collaboration on the development of new artificial intelligence analyses of intracranial datasets in this field has already started with Facebook Artificial Intelligence Research (FAIR).
The PhD candidate will be encouraged to deepen this collaboration and to take an active role in this development.

Thesis supervision
Jean-Rémi King and Pierre Bourdillon

PSL
Created in 2012, Université PSL aims to develop interdisciplinary training programmes and science projects of excellence among its members. Its 140 laboratories and 2,900 researchers carry out high-level disciplinary research, both fundamental and applied, fostering a strong interdisciplinary approach. The scope of Université PSL covers all areas of knowledge and creation (Sciences, Humanities and Social Science, Engineering, the Arts). Its eleven component schools gather 17,000 students and have won more than 200 ERC grants. PSL was ranked 36th in the 2020 Shanghai ranking (ARWU).

Required Research Experiences
RESEARCH FIELD: Computer science
YEARS OF RESEARCH EXPERIENCE: 1 – 4

Offer Requirements
REQUIRED EDUCATION LEVEL: Computer science: Master Degree or equivalent
REQUIRED LANGUAGES: ENGLISH: Excellent

Skills/Qualifications
Background in a fourth-generation programming language (Python and MATLAB appreciated), in big data analysis and in modern artificial intelligence concepts (deep and shallow learning…). Knowledge of and experience in signal analysis and processing (especially MEG, EEG or intracranial EEG) are welcome, as is basic knowledge of linguistics.