Character Network Analysis, Distant Reading and Infinity

In this post written on a rainy Sunday, I gather the concepts of mathematical induction and distant reading around character network analysis. It should probably have been divided into two smaller posts, but I found it interesting to discuss scalability in mathematics and in literary studies in parallel.

As a mathematician, I have discovered many concepts, in my studies or later, that suddenly blew apart the view I had of the world. During my first year in gymnase (college), I didn’t understand a single thing of what was happening in the physics course, where we were studying mechanics, i.e. an application of calculus, without ever having studied calculus. But when we studied that subject in the mathematics course one year later, suddenly everything became clear: the derivative describes what is happening at a single moment, and by extension at all the others. It is the tendency of the quantity described by a function to grow, remain constant, or decrease. Then the derivative of the derivative… describes how the derivative behaves. Yep, you can do that infinitely. In the physics course, we had studied how to represent the position of a point in space (e.g. a car) at any moment in time by a formula. Differentiating it would give us the speed of the car. Average speed can be computed mentally by dividing the distance from point A to point B by the time it takes. But more importantly, the speed given by the derivative is the speed at one exact moment in time: it is what happens when A and B are so close that they merge into the same point. Doing the mental computation here is problematic, since you need to divide something that looks like zero by something that looks like zero too. The derivative lets you know the speed of the car as if you were reading it on the dashboard.
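To make the "A and B getting closer" idea concrete, here is a minimal Python sketch. The position formula x(t) = 3t² and the function names are made up for the illustration, not something from the physics course.

```python
def position(t):
    """Hypothetical position of the car in metres after t seconds: x(t) = 3 t^2."""
    return 3.0 * t ** 2


def average_speed(t, dt):
    """Distance from A to B divided by the time it takes, B being reached dt seconds after A."""
    return (position(t + dt) - position(t)) / dt


def speeds_as_b_approaches_a(t, steps=6):
    """Average speeds for dt = 1, 0.1, 0.01, ...: they approach the derivative at time t."""
    return [average_speed(t, 10.0 ** -k) for k in range(steps)]


# The exact derivative of 3 t^2 is 6 t, so the dashboard would read 12 m/s at t = 2.
for speed in speeds_as_b_approaches_a(2.0):
    print(speed)
```

The printed values start at 15 and get closer and closer to 12 as B approaches A, without ever dividing zero by zero.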

I want to draw a parallel between one of these experiences and what I’m working on now, but I feel the need to keep sidetracking.

The first year of grad school corresponds to the peak of mathematical mindblowness in my life (it is also the year when I studied the most). Having to prove everything you do, without letting a single detail slip through, turned everything upside down in my life. Calculus was not about numbers, was not about calculating anything. It was about epsilon, delta, i, k, x_i, x_j, limits, series and the beautiful | meaning “such that”, among others. We were spending all our time proving. I began to see the world differently, through mathematical lenses.

Unlike calculus, linear algebra was mostly made of pain-in-the-ass pen-and-paper calculations, but after a few weeks you would discover that you had dived into spaces made of so many dimensions that your brain had difficulty seeing how to move through them. At first you were studying a space of 2 dimensions, then a space of 3 dimensions… only to realise that not so many things had changed. And then a space of 4 dimensions. (You see the pattern.) You can do whatever you want, you have received the means to find your way there through calculation, and maybe later you’ll “see” something (and you’ll help me find a nice and clean answer to this question people have asked me so many times: “Can you see in four dimensions?”). Ever done one of these magical PCAs but never understood how it works? That’s linear algebra, and it’s never too late.

The second semester of the geometry course presented differential geometry: the applications and images were so beautiful that practicing for this course was like playing a game. It didn’t require much effort (in comparison to physics); you were writing down abstract formulas and in return these formulas were shaping incredible stuff and would turn intervals of real numbers into weird creations in space. You would wander on these brainfucked manifolds (do I need to say that there is no limit on the number of dimensions here either?) and re-do your calculus and your car simulations during the whole sunny afternoon.

EXERCISE

If you want to have something to think about on the subject of dimensions, I can propose a small exercise: you just need to remember what perpendicularity is and we are ready to define hyperplanes!

There are many ways to define a straight line: one is to pick a vector, choose a point and that’s it. How does that work? Grab a sheet of paper: a vector is defined by a direction (choose a point anywhere on the physical border of the page), an orientation (towards that point or in the opposite direction) and a length. At this point of the exercise, you have drawn a nice arrow. This arrow is a vector, and you can put it anywhere you want on the page: it is the same vector as long as it remains parallel to the original, has the same length and keeps the same orientation. You do not need to fix it to a point. An infinite number of lines are defined by being perpendicular to this vector, but a unique straight line is defined by being perpendicular to the vector and passing through the point you chose. That’s it for the two-dimensional case.

What happens in a three-dimensional space? Well, if you take a vector and a point, you find… an infinite number of straight lines. Huh, are we stuck at this point? On the contrary: what results from drawing all these straight lines is a plane, that is to say a two-dimensional space embedded in a three-dimensional space. (We are used to “space” being only three-dimensional, but the more general definition of (Euclidean) space does not have this limitation.) What we had in the previous paragraph was a one-dimensional space (the straight line) embedded in a two-dimensional space (the plane). Now you have to believe me, and you probably see it coming: what happens when you take a vector and a point in a four-dimensional space? You are defining an infinite number of planes, and all these planes taken together form a three-dimensional space: a hyperplane of dimension 3. By extension, the plane we constructed before was a hyperplane of dimension 2 and the straight line was a hyperplane of dimension 1. So, what have we done? We have just manipulated geometrical objects in a space of more than three dimensions, and we have defined a way to describe a mathematical object in absolutely any dimension greater than or equal to 2. Now, before jumping to dimension 10, maybe take some time to think about what this implies, about what a four-dimensional space looks like.
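For those who prefer computing to picturing, the exercise can be sketched in a few lines of Python. It assumes the usual dot product as the test for perpendicularity, and the function names are mine: a point x belongs to the hyperplane when the vector going from the chosen point to x is perpendicular to the chosen vector. The same code works in dimension 2, 4 or 10.

```python
def dot(u, v):
    """Dot product of two vectors of the same dimension."""
    return sum(a * b for a, b in zip(u, v))


def on_hyperplane(x, normal, point, tol=1e-9):
    """True if x lies on the hyperplane through `point` perpendicular to `normal`."""
    difference = [xi - pi for xi, pi in zip(x, point)]
    return abs(dot(normal, difference)) <= tol


# Dimension 2: the hyperplane is a straight line (here the horizontal line y = 1).
print(on_hyperplane([3, 1], normal=[0, 1], point=[0, 1]))

# Dimension 4: the hyperplane is a three-dimensional space (here, fourth coordinate = 1).
print(on_hyperplane([5, -2, 7, 1], normal=[0, 0, 0, 1], point=[0, 0, 0, 1]))
print(on_hyperplane([5, -2, 7, 2], normal=[0, 0, 0, 1], point=[0, 0, 0, 1]))
```

Nothing in the code knows about the dimension: the lists could just as well have ten coordinates.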

Zero, One, Infinity

We have seen that computation can be conducted on stuff we cannot physically apprehend (at least I cannot). I could keep writing about how operational research transformed my everyday life into the eternal search for the shortest path, or how game theory made me see our society as a Markov chain of absurd decisions, but at this point I am going to focus on two ideas, one at the core of hyperbolic geometry, the other at the core of statistics, which I am convinced are somehow related. Then I will talk about distant reading, I promise.

Euclid defined geometry with five axioms from which all the rest could be deduced. Four of them are indisputable, but the fifth–stating that given a straight line and a point not belonging to that line, there exists one and only one straight line parallel to the first line and passing through this point–led to an open discussion: is it really an axiom or can we deduce it from the four others? Adopting an opposite approach to that question, in the 19th century some mathematicians began to ask: what if we keep the first four axioms and we drop or replace the fifth one? What if, given a straight line and a point, we could draw more than one parallel? Or could not draw any? I studied the consequences of these questions during the first semester of the geometry course: stating that there are no parallels leads to spherical geometry (among others) while stating that there is more than one (meaning an infinite number) leads to hyperbolic geometry, which incidentally was the subject of my master’s thesis. This pattern has been stuck in my mind since: 0, 1, infinity. In my master’s thesis, I discussed one of the models used to represent hyperbolic geometry in two dimensions: the Poincaré half-plane model.

(Please skip the next paragraph if you are not in the mood for some mathematical nerdiness more or less unrelated to the subject in the title. It looks like, seven and a half years later, I need to indulge an outburst of nostalgia.)

In this model, you draw a straight line and consider only one of the two halves that it defines. The straight line is infinity, just like the farthest point at the end of any straight line you would draw perpendicular to the “infinity” line. Now we define what a straight line is in this model, with this geometry, because it is no longer the straight line we know. In the real world, the one from our college years, a straight line is often defined as the shortest way from one given point to another given point, extended infinitely in both directions. Here, this is more or less the same, but the metric–the way to measure the distance between two points–has changed. For the sake of coherence, we rename the “straight line” with its mathematical name: the “geodesic”. In this model, given two points, there are two possible kinds of geodesics (in fact one, but visually two), a fact that I am not proving here. I need you to believe me: a geodesic between two points is the arc delimited by these points that is part of the half-circle passing through these points and whose center is on the “infinity” line. The (visually) second type of geodesic occurs when the two points are aligned on a Euclidean perpendicular to the “infinity” straight line: in this case the geodesic passing through these two points is the perpendicular itself. Now what about parallelism? Well, given a geodesic and a point that does not belong to this geodesic, a parallel to the geodesic passing through the given point is any geodesic through that point that does not cross it: draw any half-circle with its center on the “infinity” line, containing this point and not crossing the geodesic, and that’s it: you have your parallel. Now you can see that there is an infinite number of possibilities.
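If you would rather compute than believe me, here is a minimal Python sketch. It assumes the usual coordinates for this model, which the paragraph above does not spell out: the “infinity” line is the x-axis and we keep the half where y > 0. The function names are mine. The first function finds the geodesic through two points, the second uses the standard distance formula of the half-plane model.

```python
import math


def geodesic(p, q):
    """Describe the half-plane geodesic through p and q (both with y > 0).

    Returns ("vertical", x) when the points are aligned on a perpendicular to
    the "infinity" line, otherwise ("half-circle", center_x, radius).
    """
    (x1, y1), (x2, y2) = p, q
    if math.isclose(x1, x2):
        return ("vertical", x1)
    # The center (c, 0) is at equal Euclidean distance from p and q.
    center_x = (x2 ** 2 + y2 ** 2 - x1 ** 2 - y1 ** 2) / (2 * (x2 - x1))
    radius = math.hypot(x1 - center_x, y1)
    return ("half-circle", center_x, radius)


def hyperbolic_distance(p, q):
    """Distance between p and q for the metric of the half-plane model."""
    (x1, y1), (x2, y2) = p, q
    return math.acosh(1 + ((x2 - x1) ** 2 + (y2 - y1) ** 2) / (2 * y1 * y2))


print(geodesic((-1.0, 1.0), (1.0, 1.0)))   # a half-circle centered on the "infinity" line
print(geodesic((0.0, 1.0), (0.0, 2.0)))    # the perpendicular itself
print(hyperbolic_distance((0.0, 1.0), (0.0, 2.0)))
```

Try moving a point closer and closer to the “infinity” line (y going to 0): the distance grows without bound, which is why that line deserves its name.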

For your information, spherical geometry happens on a… sphere, and the geodesics are the great circles: all the largest circles you can draw; they cut the sphere in two equal parts. And you can verify: it is impossible to draw two great circles that would not intersect. Parallelism does not exist there.

You often find this pattern in mathematics, since zero is the neutral element for addition, one the neutral element for multiplication, and infinity the place where all the stuff you don’t control disappears. Most of the time, this can be linked to mathematical induction: given a statement you want to prove, zero is a special case, one is a starting case that is easy to check. Then, if you can prove that assuming the statement is true for an arbitrary natural number n implies that it is true for n+1, you have proved it for every natural number, all the way up to infinity.
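As a small worked illustration of that scheme, here is the classic textbook example (my choice, nothing specific to this post): the sum of the first n natural numbers.

```latex
\textbf{Claim.} For every natural number $n \ge 1$,
\[
  \sum_{k=1}^{n} k = \frac{n(n+1)}{2}.
\]

\textbf{Starting case ($n = 1$).} The left-hand side is $1$ and the right-hand side is
$\frac{1 \cdot 2}{2} = 1$.

\textbf{Induction step.} Assume the claim holds for some $n \ge 1$. Then
\[
  \sum_{k=1}^{n+1} k
  = \frac{n(n+1)}{2} + (n+1)
  = \frac{n(n+1) + 2(n+1)}{2}
  = \frac{(n+1)(n+2)}{2},
\]
which is the claim for $n+1$. Hence the claim holds for every natural number $n \ge 1$.
```

One easy check plus one step that works for an arbitrary n, and the statement is settled for infinitely many cases.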

One, Two, Infinity

In the recent past, I was a teaching assistant in a statistics course for first-year grad students in psychology. It lasted five years, and by the time I moved somewhere else I knew the structure of the course by heart. However, it was only when reaching the end of the Ph.D. thesis writing epoch that this structure became an influential, transferable mathematical pattern like the numerous examples given earlier in this blog post.

We studied statistical hypothesis tests during the whole second semester. The tests were classified along two dimensions. The first one was the type of variable: nominal, ordinal or numeric. The second one concerned the number of samples: one, two, or more. (For the sake of precision, there was also a special case: one sample but two measures instead of one.) The course proposed a statistical test for each combination and the whole thing was presented in a nice table that made the students’ exam revision easier. Here, the connection I am trying to draw may be perceived as exaggerated, but I cannot help thinking that all this is related, at the risk of finishing this post on an open ending.

One, two, infinite: this is how I see the construction of methods aimed at analysing corpora ranging from one literary work to all of them. There is the need to be capable of detailing one work, then of comparing two works (implying, by iteration, the comparison of many more), and finally the will to never stop increasing the stream of novels in order to be able to compare them all. Can it be linked to the zero, one, infinite impression?

Nowadays, I am exploring the relatively recent concept of character network analysis, that is the study of the characters of a novel, with a focus on their relations. Most of the time, a character network is the model of a novel’s discourse: it positions characters relative to one another based on their interactions in the text. There is a lot to ask and answer about what this object means, what it represents relative to the character-system, to the discourse and to the story. In my work, I focus more on network analysis and statistical methods, and this is where all this mathematical preamble links, in my opinion, to distant reading (but I do not know if this helps somehow). Here is the approach I have started following when working on character network analysis methodology.

One Work

Studying one work is essential to set and define the construction process. Reading and annotating that work, then comparing it with the resulting network(s), must be done a few times in order to understand that process and maybe improve it. Admittedly, this should be done for more than one novel, but it cannot be done for all the novels we want to study, since hopefully their number is huge. The extraction method can be manual, but is preferably automatic. This is the right moment to think about and test various possibilities, since it does not require too much work or computing time.

Then comes the optional question of how to visually represent it, which is easily solved in the case of a corpus of one.

Finally, the other challenging part is to develop the analytical framework: which measures should we use, when, and for what results? Is it relevant to import methods from social network analysis (SNA), should we adjust them, should we start from scratch? I have started studying centrality measures and a few other commonly used methods, and I believe that importing methods from SNA is possible and necessary, keeping in mind of course that what we obtain is not a social network, not even a fictitious social network of the society of characters in the story, but a representation of the character-system, i.e. the way the author organised them in the narration.
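To give an idea of what importing a measure from SNA looks like, here is a minimal Python sketch of degree centrality, probably the simplest centrality measure. The tiny network and the character names A to D are made up for the illustration, and this is not a description of the framework I use in my own work.

```python
def degree_centrality(edges):
    """Map each character to its number of distinct neighbours divided by (n - 1)."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    n = len(neighbours)
    if n < 2:
        return {name: 0.0 for name in neighbours}
    return {name: len(adjacent) / (n - 1) for name, adjacent in neighbours.items()}


# Hypothetical toy network: an edge means that two characters interact in the text.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]

for name, score in sorted(degree_centrality(toy_edges).items()):
    print(name, round(score, 2))
```

A character connected to everybody gets 1, an isolated one would get 0. What such a score means for a character-system, as opposed to a social network, is precisely the open question.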

Two Works (Comparison)

This is the moment when we do not only compare characters’ positions relative to one another but when we reach a higher level and think about features summarising each network. This is distant reading showing up in character network analysis methods and their future developments. How do we compare two networks? Can we do that visually? (A hint: no.) At this point of the discussion, there are few existing articles dealing with such questions, and I will soon add a small one by presenting a conference paper at Sydney’s DH 2015 conference with the character networks for the twenty novels of Émile Zola’s Les Rougon-Macquart, where I develop/adjust methods to detect cores of protagonists and show that these cores help differentiate the character networks. The research questions to be asked are various, but in my opinion they can be brought back to general questions such as the classification of character networks, and in a way of their corresponding novels (tread lightly).
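As an illustration of what “features summarising each network” can mean, here is a minimal Python sketch that reduces a network to three numbers: its number of characters, its density, and its degree centralisation (Freeman’s measure, borrowed from SNA). This is not the core-detection method of the conference paper, just a simpler stand-in, and the two toy networks are made up.

```python
def summarise(edges):
    """Summarise an undirected network by size, density and degree centralisation."""
    neighbours = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    n = len(neighbours)
    if n == 0:
        return {"characters": 0, "density": 0.0, "centralisation": 0.0}
    degrees = [len(adjacent) for adjacent in neighbours.values()]
    m = sum(degrees) // 2  # every edge is counted once from each end
    density = 2 * m / (n * (n - 1)) if n > 1 else 0.0
    if n > 2:
        max_degree = max(degrees)
        centralisation = sum(max_degree - d for d in degrees) / ((n - 1) * (n - 2))
    else:
        centralisation = 0.0
    return {"characters": n, "density": density, "centralisation": centralisation}


# Two hypothetical networks: one organised around a single hero, one fully connected.
star = [("Hero", "A"), ("Hero", "B"), ("Hero", "C")]
clique = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"), ("C", "D")]

print(summarise(star))
print(summarise(clique))
```

The star gets a centralisation of 1 and the fully connected network a centralisation of 0: two numbers are already enough to tell these two extremes apart without drawing anything.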

I should not forget to mention that at this level of observation it becomes more than necessary to have an automatic method to build the character networks. In my case I use back-of-the-book indexes of characters: they have the advantage of providing a disambiguated table of occurrences but the disadvantage of making scalability nearly impossible (at least with manual indexes).
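Here is a minimal Python sketch of how a table of occurrences can be turned into a network. It rests on an assumption I am making for the example: two characters are linked when they appear on the same page, and the weight of the link is the number of pages they share. The index below is made up (characters A to D, invented page numbers).

```python
from itertools import combinations


def network_from_index(index):
    """Return {(a, b): weight}, where weight counts the pages on which both a and b appear."""
    pages = {name: set(page_list) for name, page_list in index.items()}
    edges = {}
    for a, b in combinations(sorted(pages), 2):
        shared = len(pages[a] & pages[b])
        if shared:
            edges[(a, b)] = shared
    return edges


# Hypothetical index: each character is mapped to the pages where it occurs.
toy_index = {
    "A": [1, 2, 3, 5],
    "B": [2, 3, 4],
    "C": [1, 5, 6],
    "D": [7],
}

print(network_from_index(toy_index))
```

Character D never shares a page with anybody and therefore does not appear in any link. Other choices (chapters or paragraphs instead of pages) would produce different networks, which is exactly the kind of possibility worth testing on a single work first.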

Many Many Works (Massiveness)

Yay, this is what we want: all the books! Big data! It will probably arrive soon. This is the distant reading ideal: how does the number of protagonists evolve with time, and across genres and countries? How centralised or distributed are all these character-systems? Can we build a relevant classification? Then we reach the trial of distant reading itself: can we discover something new about the theory of characters or the history of literature? Can we confirm or contradict assumptions that have been made by university professors having read only a few tens of thousands of novels?

Conclusion

In some cases, the workload necessary to create a framework for the analysis of one work is equivalent to the workload needed to extend it to the analysis of any number of them. Many studies have done the job for one or a few character networks. We are soon going to see character networks for a whole genre or an entire epoch compiled and compared, and maybe one day character networks built with these construction methods for all the novels ever digitised.

As I feared, I haven’t solved what is a mystery for me. In fact I haven’t tried very hard. Is the “one, two, infinite” rule the distant reading equivalent of the “zero, one, infinite” rule in hard sciences? The same pattern or a sibling one? Does it allow a transfer of methods? Am I right to care about that or just a bit too obsessed with patterns in science? Maybe we should call it distant reading induction, but I am quite convinced that this is redundant by definition.