Identifying Latent Semantics in Action Games for Player Modeling
Abstract
Machine learning approaches to player modeling traditionally employ highlevel gameknowledgebased features for representing game sessions, and often player behavioral features as well. The present work makes use of generic lowlevel features and latent semantic analysis for unsupervised player modeling, but mostly for revealing underlying hidden information regarding game semantics that is not easily detectable beforehand.
Keywords
Action game semantics, Latent semantic analysis, Latent dirichlet allocation, kmeans clustering, Lowlevel feature representation, Player modeling
Introduction
Player modeling has been attracting the interest of game design and development experts for several years, as a means to increase player satisfaction and immersion. According to the inclusive reviews in [1,2], modeling techniques vary from empirical (datadriven) [37], where the application of machine learning or statistical analysis to gaming data enables predictions of playing styles, to theoretical (i.e. analytical), mostly applicable to boardlike games, where search and optimization techniques are used to determine the moves towards the best outcome [8].
Regarding empirical approaches to player modeling, various learning techniques have been experimented with; Supervised, like support vector machines for predicting difficulty adjustment [9], Bayesian networks for classification [10], statistical analysis of the distribution of player actions [11], and unsupervised, like selforganizing maps [12], reinforcement learning [13], transfer learning [14] and preference learning [15]. Supervised techniques (standalone or in combination with unsupervised approaches) have been gaining in popularity [1621] during the last threefour years, compared to purely unsupervised approaches [12,22,23], mostly due to their improved performance. Dimensionality reduction techniques, other than selforganizing maps, have been experimented with for unsupervised modeling: Linear Discriminant Analysis has been applied to arcadestyle as well as combatstyle games [24], where match data are annotated with the players' identity to enable the supervised application of Linear Discriminant Analysis, and then kmeans clustering groups together players of the same gaming style.
All previous approaches use a limited number of highlevel game and player features to perform modeling, that are gamedependent (vary from game to game, and a game expert is required to define them) and whose impact on the player model is to some extent apriori sensed. The highlevel features pose significant demands on knowledge resources, while they minimize expectations to extract new knowledge and unforeseen relations and dependencies between game and player features.
The present work proposes grouping similar playing styles together by modeling the semantics of the game domain. There are two possible ways for supplying domain knowledge [25]: By hand, making use of domain experts' knowhow, and automatically, by deriving the semantics from large corpora of "word" sequences, i.e. sequences of concepts that carry units of meaning related to the domain. The first approach is more accurate, but domaindependent, while the second (adopted in the present work) is useful when no handcrafted knowledge is available.
In an attempt to disassociate the player modeling process from the necessity of an already known highlevel knowledgebased feature set, the present paper proposes.
 The use of numerous lowlevel generic screendistributed features, relatively easily derived from any action game, that are raw (i.e. they are unprocessed and don't undergo any averaging/rate counts etc.); Also, no sophisticated knowledge about the game is required (e.g. specific time lapses, specific player reactions, specific game frames and periods).
 The use of these features to model an action game domain knowledge, i.e. to semantically represent game states and game sessions.
 The application of Latent Semantic Analysis (LSA) to the game termdocument matrix in order to enable the revelation of hidden, previously unknown semantic relations among players, game states and sessions.
 The comparison of LSA to other semantic modeling approaches, like Latent Dirichlet Allocation (LDA) to note their differences in addressing the data sparseness problem.
LSA has been applied with significant success to several domains, other than IR, like essay assessment in language learning [26], intelligent tutoring [27], text cohesion measurement [28], summary evaluation [29], text categorization [30]. Although all previously mentioned LSA applications have been performed on text corpora, some approaches have proposed its use in different nontextual knowledge domains like board game player modeling [31], complex problem solving [32], gene function prediction [3335], web navigation behavior prediction [36], collaborative filtering [37], semantic description of images [38].
LDA has been applied to several distinct modeling applications, like topic detection [39,40], tag recommendation [41], entity resolution [42], human action recognition in video streams [43], spam recognition [44], satellite image clustering [45], flood event modeling [46], source code analysis [47].
Latent Semantic Analysis
In Information Retrieval (IR) the most important data structure is the termdocument matrix, where every matrix row corresponds to a vocabulary entry (the vocabulary or lexicon consists of the words/terms appearing in a document collection) and every matrix column corresponds to a document in the collection. Matrix cells may indicate term incidence (contain a Boolean value that denotes the presence or absence of a term in a document), term frequency (tfdenoting the number of times the term appears in a document) or more sophisticated weights that incorporate the discriminative power of every term, i.e. its ability to distinguish groups of documents based on their semantic similarity, like the widely used tf.idf metric, where idf is the logarithm of the inverse document frequency (dfthe number of documents a term appears in). The larger the number of documents a word appears in, the weaker its discriminative power, i.e. its ability to semantically mark a document. The following formula defines tf.idf, N is the total number of documents in the collection.
$$tfidf\text{}=\text{}tf*\mathrm{log}(\raisebox{1ex}{$N$}\!\left/ \!\raisebox{1ex}{$df$}\right.)\text{(1)}$$
In a realistic IR application scenario, the number of rows and columns of the termdocument matrix may reach tens or hundreds of thousands. LSA [48] is a mathematical method, initially proposed for reducing the size of the termdocument matrix in the IR setting, by applying Singular Value Decomposition (SVD). SVD decomposes the initial termdocument matrix A into a product of three matrices and "transfers" matrix A into a new semantic space:
$$A\text{}=\text{}TS{D}^{T}\text{(2)}$$
T is the matrix with rows the lexicon terms, and columns the dimensions of the new semantic space. The columns of D^{T} represent the initial documents and its rows the new dimensions, while S is a diagonal matrix (all nondiagonal elements have a zero value) containing the singular values of A. Multiplication of the three matrices will reconstruct the initial matrix. The product can be computed in such a way that the singular values are positioned in the diagonal of S in descending order. The smaller the singular value, the less it affects the product outcome. By keeping only the first k, i.e. the k largest, singular values, and setting the remaining ones to zero and calculating the resulting product, a lowrank approximation A_{k} of the initial matrix A may be calculated as a leastsquares best fit. The rank of the new matrix A_{k} is equal to the number k of selected singular values.
As an interesting side effect, dimensionality reduction decreases or increases the cell values of matrix A_{k}, compared to those of A. It may even set the occurrence of words to higher than zero for documents that they initially did not appear in. Thereby semantic relations between words and documents are revealed that were not apparent at first (latent). It needs to be noted that LSA is fully automatic, i.e. the latent semantic relations are learned in an unsupervised manner.
LSA for Action Games
Unlike boardlike and arcade games, action games have properties resembling those of complex/dynamic environments: Causality (actions/decisions often affect subsequent actions/decisions), time dependence (environmental circumstances that affect actions/decisions vary over time), and latent, implicit relations between domain properties that are not straightforward.
The first challenge is the identification of the set of domains "words", corresponding to the collection vocabulary in the IR domain. In board and boardlike games, like tictactoe or chess, domain "words" are easy to identify. Boards may be viewed as grids of cells and each cell state (e.g. X/O/empty in tictactoe) constitutes a "word" [25]. In action games "words" are harder to identify. In the firefighting microworld of [32] "words" are actions like appliance moves, or water drops. The definition of a game "word" depends on the intended use of the model. If the intended use is behavior prediction, a "word" needs to model a player's action, as the player's sequence of actions (in a given context) defines his behavior. Words need to be primitive carriers of domain knowledge and every word represents a distinct game state (a game screenshot at a given moment) in a unique manner. In the present work two ways for defining words are proposed. The first (henceforth the "holistic" approach) makes use of highlevel game features (e.g. position of player ship, number of enemy ships etc) and measures their value for each particular game state. The concatenation of all feature values in a given game state is the word. The second (henceforth the "grid" approach) considers each game state to be a grid of cells, i.e. the game screen is divided into rows and columns so that the resulting cell size can roughly fit the game elements (spaceships, weapons etc). Each cell may or may not contain one or more game elements, i.e. game state semantics is acquired by combining the semantics of each cell in a given state. The number of possible state conditions equals ${2}^{n*m}$ (m is the number of cells, and n the number of game elements). A word is the concatenation of all cell conditions for the given state.
The second challenge involves the definition of documents. In IR, documents are meaningful utterances that carry information regarding their topic, author, genre etc. Similarly, game documents need to constitute complete and meaningful player "utterances", i.e. wellformed sequences of words that constitute complete descriptions of player actions or context conditions. Herein these descriptions address entire game sessions as these carry a complete meaning and the mark of each individual player. A game session is represented by taking a sample of the game state at constant predefined time intervals (i.e. every 500 msecs) and registering the sequence of words ("holistic" or "grid") that describe the sample. Each sample represents a game state at the specified time point.
After words and documents have been defined, the termdocument matrix may be constructed. In this matrix each row is a word (i.e. game state) and each column is a document (i.e. game session). Two matrices are experimented with: One where each matrix element denotes the tf and one where it denotes the tfidf weight of a particular word in a particular document. The termdocument matrices are very sparse, i.e. only very few words appear in more than one document and very few words appear in a document more than once. LSA is applied to the matrices in an attempt to address this sparseness and extract the knowledge lying underneath it.
Having constructed the termdocument matrices, each game session may be viewed as a learning vector and kmeans clustering is applied to the initial and the reduced dataset so as to group similar game sessions together in an unsupervised manner.
Latent Dirichlet Allocation
LDA [49] is a probabilistic process for modeling corpora. Documents in a collection are treated as distribution mixtures over 'topics'. 'Topics' are latent notions that are represented as word distributions P(wt). Given a document d, the probability that it involves a mixture of topics t is P(td). The probability of a specific word w_{i} given a document d, can be calculated as:
$$p\left({w}_{i}d\right)\text{=}{\displaystyle \sum _{j=1}^{t}p\left({w}_{i}{t}_{i}\text{}=\text{}j\right)}\text{}p\left({t}_{i}\text{=}jd\right)\text{(3)}$$
where t_{i} is the latent topic and t is the number of topics, which is apriori userdefined. LDA models the above distributions using Dirichlet priors from an unlabeled collection of documents. As a result, the process outputs a documenttopic matrix and a termtopic matrix, i.e. a matrix containing membership probabilities of every document for every topic and a matrix with membership probabilities of every word for every topic respectively.
Unlike LSA, where the resulting featurevectors are sometimes hard to explain, especially when they involve negative values, LDA probabilities on topics and documents are more straightforward to understand and make use of in other applications.
LDA has been chosen for modeling the game semantics in this work, compared to similar methods, like word2vec/doc2vec modeling, due to its higher interpretability (LDA results in topic probabilities, while word2vec results in a vector of real numbers that are hard to interpret) and its globality^{a}. In other words, LDA has the ability to model global dependencies (documents globally predict words), while, in word2vec, the modeled dependencies are local (one word predicts another word nearby). Given the complexity of the action game semantics, global dependencies need to be uncovered and are more useful.
^{a}https://multithreaded.stitchfix.com/blog/2015/03/11/wordisworthathousandvectors/Experimental Setup: Space Debris
The videogame used for the purposes of data collection is based on Space Debris [22]. The action takes place within the confines of a single screen, with alien ships scrolling downwards. There are two types of enemy spaceships, the carrier (enemy 1) which is slow and can withstand more laser blasts, and a fighter (enemy 2) which is fast and easier to destroy. The player wins when he has successfully with stood the enemy ship waves for a predetermined time. The game environment is littered with floating asteroids which in their default state do not interact (i.e. collide) with any of the game spaceships. In order to do so, an asteroid has to be "energized" (hit by player weapon). Also floating are shield and life powerups which the user can use to replenish his ship's shield and remaining lives. The player's ship is equipped with a laser cannon which she can use to shoot alien ships. The laser canon is weak and about 45 successful shots are required to destroy an enemy ship (except for the boss which requires many more). The laser can also be used to "energize" an asteroid and guide it to destroy an enemy ship. A screenshot of the game is shown in Figure 1.
The participants included 10 players (74 game sessions, 10532 game states). Each participant had a 5minute trial period for becoming acquainted with Space Debris. Then, he/she was given a short questionnaire consisting of general gaming questions (about gaming experience, preferences etc.) and specific questions about Space Debris. The questionnaire is used as an initial resource to categorize the gaming style of the player into one of four categories: Novice (a player with little gaming experience and playing Space Debris without any particular style), tactical (a player keen on playing strategy or adventure games and when playing Space Debris makes wise use of the laser and powerups), aggressive (a player keen on action games and when playing Space Debris fires constantly without frequent use of the powerups), and defensive (a player keen on puzzle and internet games and when playing Space Debris does not fire or tries to avoid the enemies in order not to be killed). Afterwards, the participants played the game for 10 minutes, with the presence of a domain expert, who witnessed the player's gaming style and accepted or disputed the questionnaire's categorization. 29% of the game states belong to the novice class, 42% to the tactical, 19% to the aggressive and 10% to the defensive class. The class of a game state is given the value of the playing style a player exhibits on the particular game session.
"Traditional" data collection
For comparative analysis purposes, a first experiment was run using a dataset that is formed by a set of highlevel features used traditionally in player modeling. Each featurevalue vector represents an entire game session (i.e. 74 vectors in total), and the features (22 in number) denote
 The outcome of the game session (win/lose).
 The session score value.
 The mean number of available life and shield upgrades during the session.
 The total number of life and shield upgrades performed.
 The minimum and maximum horizontal and vertical position of the player during the session.
 The total number of lasers fired by the player and by the enemies during the session.
 The total number of enemies killed by lasers and asteroids.
 The total number of enemies escaped.
 The total number of enemies very close to the player.
 The total number of enemies close to the player.
 The total number of enemies on screen.
 The total number of asteroids hit.
 The total number of visible asteroids.
Classification tests were run with several stateoftheart classifiers (i.e. pruned decision trees, random forests and Support Vector Machines) on the Weka machine learning workbench (https://www.cs.waikato.ac.nz/ml/weka/). Decision trees and random forests are used to demonstrate the features' impact on the gaming style, and the extent to which this knowledge is already known apriori. As can be seen from the derived decision tree (Figure 2), the borderline of the player's position, the number of performed life and shield upgrades and the game result are closely related to the gaming style, information that is not so surprising, but can be sensed to some extent beforehand. The confusion matrix is shown in Figure 3. In this matrix rows show the actual gaming style derived from the questionnaire and the observation process described earlier (e.g. there are 25 + 2 + 2 + 0 = 29 actual tactical game sessions), and the columns show the style predicted by the decision tree (i.e. of the 29 actual tactical games, 25 are correctly classified as tactical, 2 as aggressive, 2 as novice and 0 as defensive).
Figure 4 shows the confusion matrix when applying random forests and Support Vector Machines to the data (both learning schemata led to the same results). The metalearning random forest schema and the SVM classifier, more sophisticated than the simple decision tree classifier, both reach 100% accuracy, showing thereby the relevance of the chosen features for modeling the particular game. Random forests were run choosing randomly 12 features and forming 100 decision trees in every run [50]. Support Vector Machines were run using a first degree polynomial kernel function and the Sequential Minimal Optimization algorithm for training [51].
"Holistic" data collection
The game features for the holistic dataset and their value range are listed next:
 The number of enemies very close to the player (denoting imminent threat) (min: 0, max: 5).
 The number of enemies close to the player (denoting danger) (min: 0, max: 9).
 The total number of enemies on screen (min: 0, max:13).
 The number of player lasers fired (min: 0, max: 5).
 The number of enemy 1 lasers fired (min: 0, max: 10).
 The number of enemy 2 lasers fired (min: 0, max: 10).
 The horizontal (X) and vertical (Y) position of the player (minX: 462, maxX: 462, minY: 334, maxY: 334).
 The number of life upgrades performed (min: 0, max: 1).
 The number of shield upgrades performed (min: 0, max: 1).
 The number of hit asteroids (min: 0, max: 6).
 The number of visible asteroids (min: 0, max: 8).
 The number of hit enemy 1 ships (min: 0, max: 8).
 The number of hit enemy 2 ships (min: 0, max: 8).
 The score value (min: 0, max: 8000).
 The number of the player's available life upgrades (min: 0, max: 3).
 The number of shields available to the player (min: 0, max: 100).
As explained earlier, a word is a string representing the concatenation of the values of all aforementioned features for a given game state, and the collected data resulted in 8458 distinct words. The feature values are discretized into at most ten value levels of equal distance. Even after discretization, the resulting termdocument matrix consists of 98.4% zero elements, showing the sparseness problem.
"Grid" data collection
The game screen is viewed as an 8 × 11 grid. Table 1 shows the distinct states that a cell may be in. More than one of the distinct states may be combined to form the cell state. A word is the concatenation of the states of the 88 cells. The vocabulary consists of 10263 words, and, in this case, zero elements in the termdocument matrix reach 98.8%.
To lessen the sparseness problem, another dataset was created, by merging the 25 distinct cell states into ten (Table 2) and by reducing the granularity of the grid down to 4 × 5 = 20 cells. Vocabulary size drops down to 9694 words and the percentage of zero elements to 98.5%. The merging process inevitably leads to information loss, as different (though similar) game elements are treated as the same. All dataset files can be downloaded from http://di.ionio.gr/hilab.
LSA for classification
All aforementioned termdocument matrices are transposed so that rows indicate sessions and columns indicate game states. LSA is applied to all aforementioned termdocument matrices. Experiments were run using 20, 40 and 60 latent semantic dimensions. The resulting D matrices depict the transformation of the game sessions (rows) into the new semantic space (columns). Performing classification experiments with the LSA datasets is not expected to lead to better results. As is claimed in the literature [52], running classification experiments using a dataset with latent attributes (dimensions) usually drops classification performance due to the unsupervised manner in which the latent dataset has been derived. Figure 5 shows the best results (confusion matrix) achieved with a latent dataset with the unpruned decision tree classifier.
Figure 6 shows the confusion matrix when using random forests (parameters are the same like before), and Figure 7 is the SVMs confusion matrix with a fourth degree polynomial kernel function. The prediction weakness of the latent dataset is evident even with sophisticated learners.
LSA for clustering: Revealing underlying semantic information
In order to identify the significance of the application of LSA, a similarity detection process between game sessions needs to be performed, other than classification. Metrics like the cosine similarity are widely employed for this purpose in the literature. Herein kmeans clustering is performed to the termdocument matrices in order to group semantically similar game sessions together. Choosing the number of clusters to equal the number of gaming styles, i.e. four, enables classestoclusters evaluation. Classestoclusters evaluation results in a confusion matrix that depicts to what percent sessions of the same style (class) are grouped into the same cluster. The confusion matrix for all initial datasets (before applying LSA) is the same regardless of the dataset type (holistic, grid or merged grid) and of using tf or tfidf weights and is shown in Figure 8b. Rows indicate actual gaming styles and columns indicate the clusters these were assigned to. For comparison purposes, Figure 8a shows the corresponding results for the traditional dataset. The difference in clustering performance is significant, clearly due to the large number of features in the termdocument datasets.
The remaining matrices ((c)(t)) correspond to the LSAtransformed datasets. Though results are far from satisfactory when it comes to matching clusters to styles onetoone (as already explained, this is not the primary concern of the proposed methodology anyway), several interesting findings are noteworthy. When the number of latent semantic dimensions reaches 60, clustering performance is the same as when no LSA is performed, i.e. no semantic information is hidden in the remaining singular values of the initial diagonal matrix that is not already captured in the first 60 singular values. Decreasing the number of singular values to 40 has a positive effect on clustering performance in most cases, while 20 dimensions seem to be insufficient to capture as well the semantic properties of the game. Regarding the holistic vs. the grid and the merged grid approach, it seems that the more "aggregate" (i.e. holistic) the nature of the dataset, the better the clustering performance, while the opposite holds for the "distributed" datasets. Intuitively, the latter require large amounts of data in order to overcome the bigger sparseness problem they face. Raw term frequencies seem to outperform tfidf scores in the "distributed" datasets, but not in the holistic dataset, where the df of the words varies significantly more and thereby makes their different discriminative power evident.
The subjective nature of the true play style assignment, the limited data, as well as the desire to investigate the semantic modeling process further, led to tests with a smaller number of clusters, i.e. 2. In this case, classestoclusters evaluation is not meaningful (a percentage of correctly clustered sessions is not provided). Figure 9 shows the formed clusters in relation to the game styles. What is interesting to observe is that for most datasets the majority of members of the first cluster are aggressive and tactical players, and of the second cluster defensive and novice players. The two formed clusters are interesting and can be explained, as novice players usually tend to play defensively, with no pattern or offensive strategy. On the other hand, aggressive and tactical players share the same confidence and are eager to win. The grid data captures this knowledge, while it is mostly lost in the merged grid, a sign of the information loss mentioned earlier. LSA helped reveal these unknown semantic similarities with lowlevel features even better than the highlevel traditional dataset.
Another interesting finding is that the aggressive style seems to be the most difficult to model. While the other styles (especially the novice and defensive styles) show a more conforming behavior in the latent dimensions (almost no example has dimension values outside a specific limited range), some of the aggressive examples often show rebellious behavior and lie away from the majority of their coclustered examples, given a latent dimension. This is partly attributed to the small number of examples of the particular class, and partly to the tothelimit type of playing that aggressive players adopt, where borderline values of the latent features are often reached. This underlying semantics was not noticeable in the initial datasets but came to light with the LSA analysis.
For experimental purposes the DBScan [53] clusterer has also been applied, as it is claimed in the literature to cope well with noisy data. DBScan is densitybased and forms highdensity regions into clusters. Predefining the value of cluster radius and the minimum number of objects required to form a cluster, the algorithm does not need to know the number of clusters beforehand, unlike kmeans. Objects that end up not belonging to any cluster are considered as noise and remain unclustered. In our experiments the minimum number of objects was set to 8, as it was the number of objects belonging to the minority class (aggressive), and several radius values were experimented with.
The only interesting findings with DBScan were achieved with the traditional dataset (Figure 10). The higher the radius value, the fewer instances remain unclustered. A radius of 0.8 leads to the optimal confusion matrix and, interestingly, novice and defensive examples are mostly grouped together, something that kmeans was able to reveal in the latent space only. Here this knowledge is derived from the traditional dataset; However, 17 instances (i.e. 23%) are left unclustered. The initial and the latent dataset lead to instances forming one cluster only, regardless of the radius size. These datasets are more complex, and multidensity, and DBScan neglects entire clusters and assigns them as noise [54].
Latent dirichlet allocation
Experiments with LDA were run on the Matlab Topic Modeling toolbox^{b} that performs LDA with Gibbs sampling. The number of topics (T) was chosen to be 4 (the number of play styles) and 2 (for comparison purposes to LSA). According to the parameter default values suggested by the toolbox, the number of iterations was set to 500, the theta (topic distribution) and phi (topicword distribution) hyperparameters on the Dirichlet priors were set to 50/T and 200/vocabularysize respectively.
^{b}http://psiexp.ss.uci.edu/research/programs_data/toolbox.htmFigure 11 shows the topic assignments of the game sessions for the initial grid and holistic datasets. For exploratory purposes, in matrices (a) and (d), the knowledge extracted from the LSA experiments is taken into account and results are depicted when only two styles are considered: Aggressivetactical and defensivenovice.
Supplementary Figure 1 and Supplementary Figure 2 show the percentage of words that belong to each of the four topics (T1T4) for every game session (session ids 174 are depicted in the xaxis) for the holistic and for the grid data respectively. As can be seen, for most sessions, differences between their wordtopic distributions are not immense, sometimes they are not even noticeable, leading to the 'unclear' confusion matrices of Figure 11.
The dashed lines separate sessions between different players. At the player level, all players can be modeled using one or at most two topics, i.e. interestingly the vast majority of their sessions consist of words that belong to one or two topics at the most.
The increased sparseness in the data makes it hard for a probabilistic modeling algorithm like LDA to identify further sensemaking concepts within the data in such a large feature space. Performance limitations of the Dirichletdistribution based traditional LDA technique on sparse data has also been discussed elsewhere [55]. Furthermore, probabilistic modeling approaches usually rely on more data for more confident results.
Conclusion
Through a set of IRinspired lowlevel features, LSAbased dimensionality reduction techniques and also a probabilistic modeling approach (Latent Dirichlet Allocation), the semantic domain of an action game is modeled, revealing dependencies and relations that are not easily detectable through traditional aggregate highlevel feature vector representations.
The future research prospects regarding the detection of hidden semantic knowledge are significant. A larger dataset with more players, the inclusion of timeand causalityrelated features, and further exploration of the obtained model (e.g. word similarity apart from session similarity researched herein) can lead to the detection of unknown knowledge that may contribute significantly to intelligent playercentered game design. It can also ensure performance of higher quality for the probabilistic modeling approaches.
References

Smith A, Lewis C, Hullett K, et al. (2011) An inclusive view of player modeling. Proceedings of the 6^{th} international conference on foundations of digital Games, France, 301303.

Hooshyar D, Yousefi M, Lim H (2018) Datadriven approaches to game player modeling: A systematic literature review. ACM Computing Surveys 50.

Thue D, Bulitko V, Spetch M, et al. (2007) Interactive storytelling: A player modeling approach. Proceedings of the 3^{rd} conference on AI and interactive digital entertainment, 4348.

Thurau C, Bauckhage C (2010) Analyzing the evolution of social groups in world of warcraft. Proceedings of IEEE conference on computational intelligence and games, Copenhagen, 170177.

Roberts D, Riedl M, Isbell C (2007) Opportunities for machine learning to impact interactive narrative. Proceedings of the NIPS workshop on machine learning and games, Malaga.

Geisler B (2002) An empirical study of machine learning algorithms applied to modeling player behavior in a first person shooter video game. University of wisconsinMadison.

Drachen A, Thurau C, Sifa R, et al. (2013) A comparison of methods for player clustering via behavioral telemetry. Foundations of digital games.

Bellman RE (1965) On the application of dynamic programming to the determination of optimal play in chess and checkers. Proc Natl Acad Sci U S A 53: 244246.

Missura O, Gaertner T (2009) Player modeling for intelligent difficulty adjustments. Proceedings of the 12^{th} international conference on discovery science, 197211.

He S, Du J, Chen H, et al. (2008) Strategybased player modeling during interactive entertainment sessions by using Bayesian classification. 4^{th} international conference on natural computation china, 255261.

Thawonmas R, Ho J (2007) Classification of online game players using action transition probability and Kullback Leibler entropy. JACIII 11: 319326.

Drachen A, Canossa A, Yannakakis G (2009) Player modeling using SelfOrganization in tomb raider: Underworld. Proceedings of the 5^{th} international conference on computational intelligence and games, Italy, 18.

Kang Y, Tan A (2010) Learning personal agents with adaptive player modeling in virtual worlds. Proceedings of the IEEE/WIC/ACM international conference on web intelligence and intelligent agent technology, Toronto, Canada, 173180.

Shahine G, Banerjee B (2007) Player modeling using knowledge transfer. Proceedings of EUROSIS GAMEONNA conference, Gainesville, FL, 8289.

Yannakakis GN, Maragoudakis M, Hallam J (2009) Preference learning for cognitive modeling: A case study on entertainment preferences. IEEE 39: 11651175.

Bisson F, Larochelle H, Kabanza F (2015) Using a recursive neural network to learn an agent's decision model for plan recognition. Proceedings of the 25th international joint conference on artificial intelligence, 918924.

Luo L, Yin H, Cai W, et al. (2016) Design and evaluation of a datadriven scenario generation framework for gamebased training. IEEE 9: 213226.

Min W, Baikadi A, Mott B, et al. (2016) A generalized multidimensional evaluation framework for player goal recognition. Proceedings of the 12^{th} annual AAAI conference on artificial intelligence and interactive digital entertainment.

Tamassia M, Raffe W, Sifa R, et al. (2016). Predicting player churn in destiny: A hidden Markov models approach to predicting player departure in a major online game. Conference on computational intelligence and games IEEE, 18.

Falakmasir M, GonzálezBrenes JP, Gordon GJ, et al. (2016) A datadriven approach for inferring student proficiency from game activity logs. Proceedings of the 3^{rd} ACM conference on learning @ Scale, 341349.

Gao C, Shen H, Ali Babar M (2016) Concealing jitter in multiplayer online games through predictive behaviour modeling. IEEE 20^{th} international conference on computer supported cooperative work in design, 6267.

Anagnostou K, Maragoudakis M (2009) Data mining for player modeling in videogames. IEEE, Greece.

Cowley B, Kosunen I, Lankoski P, et al. (2014) Experience assessment and design in the analysis of gameplay. Simulation & Gaming 45: 4169.

Gow J, Baumgarten R, Cairns P, et al. (2012) Unsupervised modeling of player style with LDA. IEEE 4: 152166.

Lemaire B (1998) Models of highdimensional semantic spaces. Proceedings of the 4th international workshop on multistrategy learning.

Haley DT, Thomas P, de Roeck A, et al. (2005) A research taxonomy for latent semantic analysisbased educational applications. Proceedings of the conference on recent Advances in natural language processing, 575579.

Graesser AC, Penumatsa P, Ventura M, et al. (2007) Using LSA in Autotutor: Learning through mixedinitiative dialogue in natural language. In: T Landauer, D McNamara, S Dennis, et al. Handbook of Latent Semantic Analysis. Lawrence Erlbaum Associates Publishers, Mahwah, US, 243262.

McCarthy PM, Briner S, Rus V, et al. (2007) Textual signatures: Identifying texttypes using latent semantic analysis to measure the cohesion of text structures. In: Anne Kao, Stephen R Poteet, Natural language processing and text mining. Springer, London, 107122.

Steinberger J, Jezek K (2004) Using latent semantic analysis in text summarization and summary evaluation. Proceedings of the conference on information systems, implementation and modeling, 93100.

Nakov P, Valchanova E, Angelova G (2003) Towards deeper understanding of the LSA performance. proceedings of the conference on recent advances in natural language processing.

Zampa V, Lemaire B (2002) Latent semantic analysis for user modeling. JIIS 18: 1530.

Quesada JF, Kintsch W, Gomez E (2001) A computational theory of complex problem solving using the vector space model (part I): Latent semantic analysis, through the path of thousands of ants. Cognitive research with microworlds, Spain, 117131.

Done B, Khatri P, Done A, et al. (2010) Predicting novel human gene ontology annotations using semantic analysis. IEEE/ACM Trans comput biol bioinform 7: 9199.

Dong Q, Wang X, Lin L (2006) Application of latent semantic analysis to protein remote homology detection. Bioinformatics 22: 285290.

Ganapathiraju M, Balakrishnan N, Reddy R, et al. (2005) Computational biology and language. Ambient Intelligence for Scientific Discovery, 2547.

van Oostendorp H, Juvina I (2007) Using a cognitive model to generate web navigation support. IJHCS 65: 887897.

Hofmann T (2004) Latent semantic models for collaborative filtering. ACM Transactions on Information Systems 22: 89115.

Basili R, Petitti R, Saracino D (2007) LSAbased automatic acquisition of semantic image descriptions. Proceedings of the Conference on semantics and digital media technologies, LNCS, 4816: 4155.

Griffiths T, Steyvers M (2004) Finding scientific topics. Proc Natl Acad Sci U S A 101: 52285235.

Weinshall D, Levi G, Hanukaev D (2013) LDA Topic model with soft assignment of descriptors to words. Proceedings of the 30th international conference on machine learning 28: 711719.

Krestel R, Fankhauser P, Nejdl W (2009) Latent dirichlet allocation for tag recommendation. Proceedings of the 3^{rd} ACM conference on recommender systems, 6168.

Bhattacharya I, Getoor L (2006) A latent dirichlet model for unsupervised entity resolution. Proceedings of SIAM conference on data mining 4758.

Deepak NA, Hariharan R, Sinha UN (2013) Clusterbased human saction recognition using latent dirichlet allocation. International conference on circuits, controls and communications, 14.

Bro I, Siklosi D, Szabo J, et al. (2009) Linked latent dirichlet allocation in web spam filtering. Proceedings of the 5th ACM international workshop on adversarial information retrieval on the web 3740.

Tang H, Shen L, Qi Y, et al. (2013) A multiscale latent dirichlet allocation model for objectoriented clustering of VHR panchromatic satellite images. Geoscience and Remote sensing 51: 16801692.

Aubert AH, Tavenard R, Emonet R, et al. (2013) Clustering flood events from water quality time series using latent dirichlet allocation model. Water Resources Research 49: 81878199.

Grant S, Cordy JR, Skillicorn DB (2013) Using heuristics to estimate an appropriate number of latent topics in source code analysis. Science of Computer Programming 78: 16631678.

Landauer T, Foltz P, Laham D (1998) An introduction to latent semantic analysis. Discourse Processes 25: 259284.

Blei DM, Andrew Ng, Jordan MI (2003) Latent dirichlet allocation. JMLR 3: 9931022.

Breiman L (2001) Random forests. Machine Learning 45: 532.

Platt J (1998) Fast training of support vector machines using sequential minimal optimization. In: B Schoelkopf, C Burges, A Smola, Advances in Kernel MethodsSupport Vector Learning.

Liu T, Chen Z, Zhang B, et al. (2004) Improving text classification using local latent semantic indexing. International Conference on Data Mining 162169.

Ester M, Kriegel HP, Sander J, et al. (1996) A densitybased algorithm for discovering clusters in large spatial databases with noise. Proceedings of the second international conference on knowledge discovery and data mining, 226231.

Ahmed R (2011) New densitybased clustering technique. Computer Engineering Department, The Islamic University of Gaza.

Eisenstein J, Ahmed A, Xing E (2011) Sparse additive generative models of text. Proceedings of the 28^{th} international conference on machine learning, 10411048.