me

Mickaël LAURENT, PhD



Last updated on July 19, 2024

About

Hi! 👋 I'm a french researcher in computer science. I am currently working with Jan Vitek at Charles University.

  • 🤓 I’m interested in programming languages and type systems.
  • 👨🏻‍🏫 I defended my PhD at IRIF under the supervision of Giuseppe Castagna and Kim Nguyen (more info).
  • ⚙️ Please try the web version of my type system prototype!
  • 🕹️ On my free time, I like playing cooperative video games and watching speedruns and e-sport competitions.
  • 🛠️ I also practice reverse engineering and related stuff: glitch hunting, Tool Assisted Speedrun, ROM hacking, etc.
  • 🔗 GitHub - ORCID - YouTube
  • 🤡 I also like Easter eggs.

You can also read my resume, and if you are interested in type systems, my thesis manuscript.

And if you are gourmand, here are some generous types:

map: (('b -> 'a) -> [ 'b* ] -> [ 'a* ]) & (Any -> [  ] -> [  ])
filter: ('a -> Any) & ('b -> Any \ `true) -> [ ('a | 'b)* ] -> [ ('a \ 'b)* ]
flatten: X1 -> [ 'a* ] where X1 = [ X1* ] | 'a \ [ Any+ ]

Publications

Polymorphic Type Inference for Dynamic Languages 5 January 2024

Giuseppe Castagna, Mickaël Laurent, Kim Nguyen POPL 2024

We present a type system that combines, in a controlled way, first-order polymorphism with intersection types, union types, and subtyping, and prove its safety. We then define a type reconstruction algorithm that issound and terminating. This yields a system in which unannotated functions are given polymorphic types(thanks to Hindley-Milner) that can express the overloaded behavior of the functions they type (thanks tothe intersection introduction rule) and that are deduced by applying advanced techniques of type narrowing(thanks to the union elimination rule). This makes the system a prime candidate to type dynamic languages.

On type-cases, union elimination, and occurrence typing 12 January 2022

Giuseppe Castagna, Mickaël Laurent, Matthew Lutze, Kim Nguyen POPL 2022

We extend classic union and intersection type systems with a type-case construction and show that the combination of the union elimination rule of the former and the typing rules for type-cases of our extension encompasses occurrence typing. To apply this system in practice, we define a canonical form for the expressions of our extension, called MSC-form. We show that an expression of the extension is typable if and only if its MSC-form is, and reduce the problem of typing the latter to the one of reconstructing annotations for that term. We provide a sound algorithm that performs this reconstruction and a proof-of-concept implementation.

Revisiting occurrence typing 1 May 2022

Giuseppe Castagna, Victor Lanvin, Mickaël Laurent, Kim Nguyen Science of Computer Programming

We revisit occurrence typing, a technique to refine the type of variables occurring in type-cases and, thus, capture some programming patterns used in untyped languages. Although occurrence typing was tied from its inception to set-theoretic types—union types, in particular—it never fully exploited the capabilities of these types. Here we show how, by using set-theoretic types, it is possible to develop a general typing framework that encompasses and generalizes several aspects of current occurrence typing proposals and that can be applied to tackle other problems such as the reconstruction of intersection types for unannotated or partially annotated functions and the optimization of the compilation of gradually typed languages.

Merit and Blame Assignment with Kind 2 19 August 2021

Daniel Larraz, Mickaël Laurent, Cesare Tinelli FMICS 2021

We introduce two new major features of the open-source model checker Kind 2 which provide traceability information between specification and design elements such as assumptions, guarantees, or other behavioral constraints in synchronous reactive system models. This new version of Kind 2 can identify minimal sets of design elements, known as Minimal Inductive Validity Cores, which are sufficient to prove a given set of safety properties, and also determine the set of MUST elements, design elements that are necessary to prove the given properties. In addition, Kind 2 is able to find minimal sets of design constraints, known as Minimal Cut Sets, whose violation leads the system to an unsafe state. We illustrate with an example how to use the computed information for tracking the safety impact of model changes, and for analyzing the tolerance and resilience of a system against faults.

Experience

Postdoctoral researcher July 2024 - Now

Charles University Prague, Czech Republic

Designing a type system for the language R. Supervisor: Jan Vitek

Temporary Teaching and Research Assistant (ATER) September 2023 - August 2024

Université Paris Cité Paris, France

Continuing my research on a set-theoretic type system for dynamic languages, along with teaching 166h to undergraduate students (Licence 1 to Licence 3).

Participation to a summer school in Oregon. Lectures were given by experts in the programming language area.

Research internship: Traceability analysis tools in Kind2 October 2019 - July 2020 (10 months)

University of Iowa - Electrical and Computer Engineering Iowa City, USA

Development of traceability analysis tools in the SMT-based model-checker Kind2. Supervisor: Cesare Tinelli

Research internship: Occurrence typing using set-theoretic types March 2019 - July 2019 (20 weeks)

Laboratoire Methodes Formelles Saclay, France

Formalisation of a type system featuring occurrence typing and set-theoretic types. Supervisor: Kim Nguyen

Research internship: Interactive loop invariant synthesis March 2018 - July 2018 (5 months)

Carnegie Mellon University - Cylab Pittsburgh, USA

Conception of a method for interactive loop invariant synthesis, in a decidable fragment of first-order logic. Supervisor: Bryan Parno

Research internship: Counterfactual causal analysis June 2017 - July 2017 (8 weeks)

Carnegie Mellon University - Computer Science Department Pittsburgh, USA

Counterfactual causal analysis in a rule-based language modelling cellular signaling. Supervisor: Jean Yang

Teaching

Fête de la Science 2023 - 2024 (12h)

Université Paris Cité Paris, France

Introduction to some concepts of Computer Science to primary and secondary school students.

Programmation réseau (PR6) 2023 - 2024 (24h TP)

Université Paris Cité Paris, France

Basics of network programming in C.

Principe de Fonctionnement des Machines Binaires (PF1) 2022 - 2023 (24h TD + 12h TP), 2023 - 2024 (24h TD + 12h TP)

Université Paris Cité Paris, France

How binary machines work (logical formulas and circuits, computer arithmetic, and some information theory).

Programmation Orientée Objet (POO3) 2021 - 2022 (24h TD + 12h TP), 2023 - 2024 (48h TD)

Université Paris Cité Paris, France

Oriented Object Programming (OOP) in Java.

Internet et Outils (IO2) 2020 - 2021 (24h TP), 2023 - 2024 (48h TP)

Université Paris Cité Paris, France

Basics of web development, using HTML, CSS, PHP and MySQL.

Initiation à la programmation Python (IP1) 2020 - 2021 (48h TP), 2022 - 2023 (24h TP)

Université Paris Cité Paris, France

Python programming for math students.

Concepts Informatiques (CI2) 2021 - 2022 (24h TD)

Université Paris Cité Paris, France

Some concepts about data structures, memory representation, and compilation.

Education

PhD of Computer Science 2020 - 2024

Université Paris-Cité, at IRIF and LMF Paris, France

Polymorphic type inference for dynamic languages: reconstructing types for systems combining parametric, ad-hoc, and subtyping polymorphism

Master of ENS Paris-Saclay (Computer Science) 2017 - 2020

Ecole Normale Supérieure Paris-Saclay Cachan, France

Master MPRI (Algorithms and Foundations of Programming) 2017 - 2019

Ecole Normale Supérieure Paris-Saclay Cachan, France

Bachelor of Computer Science 2016 - 2017

Ecole Normale Supérieure Paris-Saclay Cachan, France

MPSI/MP* (Preparatory classes for the Grandes Ecoles) 2014 - 2016

Lycée Champollion Grenoble, France

Baccalauréat S (High School Diploma) 2011 - 2014

Lycée Paul Héroult Saint Jean de Maurienne, France

Projects

Type checker prototype for dynamic languages 2021 - Now

A type checker prototype for a custom functional language, implemented in OCaml, and featuring:

  • Set-theoretic types (with intersection, union, negation and recursive types)
  • Ad-hoc polymorphism to capture overloaded behaviors
  • Parametric polymorphism to capture genericity
  • Precise typing of type-cases and pattern matching (occurrence typing)
  • Type inference (with optional user type annotations)

Some missing features (and thus, future work):

  • Support of side-effects
  • Support of abstract data types (with nominal subtyping)
  • Gradual typing
  • Row polymorphism for records

Arbitrary Code Execution for generation 3 Pokemon games 2021 - Now

I think the game I played the most when I was a kid was Pokemon Ruby, on GBA. With my brother, we completed the regional Pokedex, with 200/200 Pokemon catched. Unfortunately, there still were two Mythical Pokemons that we were unable to catch, and for a good reason: those were accessible only for those attending some physical events, occuring only in some places and sometimes only in Japan. This was the early time of forums, and lot of rumors started to flourish: you can unlock Deoxys by beating the Battle Tower, by waiting for the white stone of the spatial center to be stolen, etc... But rumors, that's all they were.

In 2021, I got interested into the glitching scene, that is, how to exploit bugs and flaws in games in order to do things that the developers did not plan. That's where I discovered that this wasn't a lie: it is indeed possible to catch Deoxys without attending physical events nor using any cheating device! There are several methods to achieve it, but the most reliable and powerful way is via Arbitrary Code Execution (ACE). It is the Holy-Grail of glitchers: exploiting the game until we are able to make it execute any code we want! If I find some time, I'll write something about it, because the process to get there is very interestig in a technical point of view. What's ironic is that it is only possible to achieve that thanks to... the protections added to the game by the developpers in order to make cheating more difficult (in particular, a very basic kind of ASLR and some checksum verifications) 🤦‍♂️

But achieving ACE is not an easy task: you have to write some code in the memory, and then make the game execute it. Both aspects are very interesting, but the tool I made focuses on the first one: how to write machine code (ARM7TDMI) in a specific location in memory. For that, we use a mechanic of the game that allows us to rename boxes of the in-game PC (where we store our pokemons): it consists of 14 names of 8 characters each. Those 14 names are stored consecutively in memory, one byte for each character, but are separated by a 0xFF byte (the EOF character). Thus, it takes in total (8+1)*14=126 bytes.

The main issue is that only about 1/3 of the 256 possible values for each byte can be written, due to the character encoding used by the game. So, we have to write machine code (using one of the ARM or Thumb instruction sets), but by being unable to write 2/3 of the possible byte values... That sounds like an impossible constraint! Fortunately, while it seems very complicated to write Thumb code with this constraint, it is possible, to write some very useful ARM opcodes: some arithmetic ones (ADC,SBC), and some memory-related ones (MOV,STR(H),LDR(H)). Of course, only a small subset of registers and immediate constants are accessible due to the constraint on the charset. But it is enough to write abritrary code: by doing the right arithmetic operations using MOV, ADC and SBC, we can compute any opcode we want in a register, and then write it a little ahead of the current instruction (PC register), using a command STR, for it to be executed.

Another constraint is the presence of a 0xFF byte (EOF), which cannot be modified, at the end of each box name. It forces us to insert some dummy machine instructions that do nothing but contain this EOF byte. After that, we just have to convert the machine code we obtain into characters using the character encoding, and we are done.

This whole process is what the tool I made automates. It parses some ARM assembly code written by the user (with some preprocessor extensions), finds a way to write immediate values using writable arithmetic operations, assembles them and encodes the result, for finally showing the box names one has to enter in the game! Using this code generator, it is also possible to setup a much more powerful ACE environment, for instance by bootstrapping a code that allows you to get rid of the limitations cited above. You can take a look at this video that shows a full exploitation of ACE in order to inject a new Pokemon species in the game!

Rom Hacking: Level editor for Kuru Kuru Kururin 2020 - 2021

In 2020, one year after my TAS of Kuru Kuru Kururin, I decided to investigate the code of this GBA game furthermore and make an editor that allows to edit the maps of the game. Armed with an emulator and a debugger, I tried to document where and how maps are stored, to extract the tiles and palettes for each level, etc. Finally, I managed to make a level editor!

Screen of the editor

My friend mohoc7 used it to create an insane(ly difficult) ROM-hack that you can download here 😁 (and it works on a real GBA: I've tested it)! I have also adapted the level editor to support the sequel of the game, Kururin Paradise, with the help of Dimedime!

Bots for the Gravitron of VVVVVV (DQN vs simulation-based) 2020

The source code of the PC game VVVVVV got officialy released in 2020. It was announced by my friend mohoc7, during his speedrun of this game at the AGDQ 2020. This motivated me to dig in the source code and see what I could do... and I finally decided to make a bot for the Super-Gravitron (a mode of the game where we have to avoid projectiles)!

I first made a Python bot using an implementation of a DQN in order to learn to play this game, as it was done by Google for some Atari games (link). For that, I built a customized version the game where inputs are taken from the console, and where game states are logged in the console too (Github).

Then, deceived by the result (though it was performing way better than the average human, it did not break the ~10min record of my friend!), I decided to make another bot, this time without using machine learning. I replicated the physics of the game in Julia (which offers better performance than Python), and made a bot that searches a trajectory based on this simulation of the game. This time it broke the record... 💪

Tool Assisted Speedrun of Kuru Kuru Kururin (GBA) 2019

In 2019, I made a Tool Assisted Speedrun (TAS) with a classmate and dear friend, mohoc7. While mohoc was the expert of the game and director of the TAS, I was the technical team. If you watched the video of the TAS, you might be thinking "well, this game is totally broken, look how it is easy to clip in the walls!". But if you try it yourself, you might be surprised how difficult it is! Actually, clipping in a wall without the help of a moving object is not humanly feasible (update: it actually is). In a TAS, we can tell the emulator to press on each frame the buttons we want: the challenge is to find a sequence of button presses that finishes the game as quickly as possible. Even with an emulator that can press buttons for us, clipping through a wall is quite complicated: for instance, another TASer, Ryuto, made a TAS years before ours without even noticing such a mechanics was possible.

So, how to achieve wall-clipping? You can find an explanation of how it works on tasvideo.org, but the conclusion is that there is no systematic way to achieve it! It relies on a very precise combination of position, velocity and collision mask, and the only way we found in order to be able to achieve it in different situations is to make a bot that tries a lot of paths. At first, I tried to make a script that communicates with the game in order to test different paths, but we abandonned this approach: it was too slow. Instead, I replicated the exact physics of the game, from the logic of collisions to the computation of trigonometric functions with the right precision!

In order to search paths, I basically used an A* algorithm with a custom cost map computed with a flooding algorithm. Some optimisations and approximations were necessary in order to reduce the search space, but in the end, it worked! However, due to the approximations and heuristics used, there is no guarantee of optimality in the results produced. Stil, this was a great challenge, both algorithmically and technically, and I really like the result 😁!