
How to classify different events in heavy-ion collisions



Presentation Transcript


  1. How to classify different events in heavy-ion collisions Qinghui Zhang

  2. Why do we need event classification? • Events in heavy-ion collisions may be different. • Some events may undergo a phase transition, but some events may not.

  3. What should we be careful about in event classification? • We need to know the features that can be used to distinguish between events • We need to know the feature values for the QGP phase and the non-QGP phase

  4. What will we do in event classification? • (1) Verification: for a given event and a claim, the system answers Yes or No • (2) Identification: for a given event, the system tells which class the event belongs to, using the database the system has
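For concreteness, a minimal sketch of the two tasks, assuming some already-trained classifier `clf` with a scikit-learn-style `predict` method (the names below are illustrative, not from the slides):

```python
# Hypothetical sketch: verification vs. identification with a trained classifier.
# `clf`, `event_features`, and `claimed_class` are illustrative names.

def verify(clf, event_features, claimed_class):
    """Verification: answer Yes/No to 'does this event belong to claimed_class?'"""
    return clf.predict([event_features])[0] == claimed_class

def identify(clf, event_features):
    """Identification: return the class this event belongs to, according to the
    classifier trained on the system's database."""
    return clf.predict([event_features])[0]
```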

  5. The complexity of heavy-ion collisions • (1) Too many particles in each event • (2) It is difficult to choose a valuable measure • (3) We do not know the detailed values of the measures for the QGP and non-QGP cases. We cannot observe the QGP directly!

  6. A typical picture of an event

  7. Support Vector Machines • Three main ideas: • Define what an optimal hyperplane is (in a way that can be identified in a computationally efficient way): maximize the margin • Extend the above definition to non-linearly separable problems: add a penalty term for misclassifications • Map the data to a high-dimensional space where it is easier to classify with linear decision surfaces: reformulate the problem so that the data are mapped implicitly to this space

  8. How to classify different events? • Suppose we can select some features of each event! • Suppose we know the feature values for the different classes. • How do we classify them? We need a method to classify different events.

  9. Support Vector Machines • Three main ideas: • Define what an optimal hyperplane is (in a way that can be identified in a computationally efficient way): maximize the margin • Extend the above definition to non-linearly separable problems: add a penalty term for misclassifications • Map the data to a high-dimensional space where it is easier to classify with linear decision surfaces: reformulate the problem so that the data are mapped implicitly to this space

  10. Which Separating Hyperplane to Use? [Figure: candidate separating hyperplanes in the Var1-Var2 plane]

  11. Maximizing the Margin • IDEA 1: Select the separating hyperplane that maximizes the margin! [Figure: margin width around a separating hyperplane in the Var1-Var2 plane]

  12. Support Vectors [Figure: support vectors on the margin boundaries and the margin width in the Var1-Var2 plane]

  13. Setting Up the Optimization Problem • With margin hyperplanes w·x + b = k and w·x + b = -k, the width of the margin is 2k/||w|| • So the problem is: maximize 2k/||w|| subject to w·xi + b ≥ k for xi in class 1 and w·xi + b ≤ -k for xi in class 2

  14. Setting Up the Optimization Problem • There is a choice of scale and units for the data such that k = 1 • The problem then becomes: maximize 2/||w|| subject to w·xi + b ≥ 1 for xi in class 1 and w·xi + b ≤ -1 for xi in class 2

  15. Setting Up the Optimization Problem • If class 1 corresponds to yi = 1 and class 2 corresponds to yi = -1, we can rewrite the two constraints as the single condition yi(w·xi + b) ≥ 1 for all i • So the problem becomes: maximize 2/||w|| subject to yi(w·xi + b) ≥ 1, or equivalently minimize ||w||²/2 subject to the same constraints (summarized below)
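For reference, the reformulation on the last three slides can be written compactly in the standard textbook form (a restatement, not text from the slides):

```latex
% Hard-margin reformulation (standard form, stated here for reference)
\begin{aligned}
&\max_{\mathbf{w},b}\ \frac{2}{\lVert\mathbf{w}\rVert}
 \quad\text{s.t.}\quad y_i\,(\mathbf{w}\cdot\mathbf{x}_i+b)\ \ge\ 1 \quad \forall i\\[4pt]
\Longleftrightarrow\quad
&\min_{\mathbf{w},b}\ \tfrac{1}{2}\lVert\mathbf{w}\rVert^{2}
 \quad\text{s.t.}\quad y_i\,(\mathbf{w}\cdot\mathbf{x}_i+b)\ \ge\ 1 \quad \forall i
\end{aligned}
```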

  16. Linear, Hard-Margin SVM Formulation • Find w, b that solve: minimize ||w||²/2 subject to yi(w·xi + b) ≥ 1 for all i • There is a unique global minimum value • There is also a unique w and b that attain the minimum • The problem has no solution if the data are not linearly separable • It is solved by quadratic programming
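A minimal sketch of this step, assuming scikit-learn is available (the slides do not prescribe a particular solver); a very large C is a common way to approximate the hard margin:

```python
# Sketch: linear hard-margin SVM on a toy, linearly separable data set.
# scikit-learn and the very large C used to mimic the hard margin are assumptions.
import numpy as np
from sklearn.svm import SVC

X = np.array([[0.0, 0.0], [0.0, 1.0], [2.0, 2.0], [3.0, 2.0]])  # toy events
y = np.array([-1, -1, 1, 1])                                     # two classes

clf = SVC(kernel="linear", C=1e10)   # huge C approximates the hard margin
clf.fit(X, y)
print(clf.coef_, clf.intercept_)     # w and b of the separating hyperplane
print(clf.support_vectors_)          # the support vectors
```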

  17. Support Vector Machines • Three main ideas: • Define what an optimal hyperplane is (in a way that can be identified in a computationally efficient way): maximize the margin • Extend the above definition to non-linearly separable problems: add a penalty term for misclassifications • Map the data to a high-dimensional space where it is easier to classify with linear decision surfaces: reformulate the problem so that the data are mapped implicitly to this space

  18. Support Vector Machines • Three main ideas: • Define what an optimal hyperplane is (in a way that can be identified in a computationally efficient way): maximize the margin • Extend the above definition to non-linearly separable problems: add a penalty term for misclassifications • Map the data to a high-dimensional space where it is easier to classify with linear decision surfaces: reformulate the problem so that the data are mapped implicitly to this space

  19. Non-Linearly Separable Data • Introduce slack variables ξi • Allow some instances to fall within the margin, but penalize them

  20. Formulating the Optimization Problem • The constraints become yi(w·xi + b) ≥ 1 - ξi with ξi ≥ 0 • The objective function, ||w||²/2 + C Σi ξi, penalizes misclassified instances and those within the margin • C trades off margin width against misclassifications

  21. Linear, Soft-Margin SVMs • The algorithm tries to keep the slack variables ξi at zero while maximizing the margin • Notice: the algorithm does not minimize the number of misclassifications, but the sum of distances from the margin hyperplanes • As C → ∞, we recover the hard-margin solution (see the sketch below)
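A small sketch of the C trade-off, again assuming scikit-learn (the data and the values of C are illustrative):

```python
# Sketch: effect of the soft-margin parameter C on overlapping toy data.
# Small C tolerates more margin violations; large C approaches the hard margin.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(-1.0, 1.0, size=(50, 2)),   # class -1
               rng.normal(+1.0, 1.0, size=(50, 2))])  # class +1, overlapping
y = np.array([-1] * 50 + [1] * 50)

for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    print(C, clf.n_support_)   # number of support vectors per class
```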

  22. Robustness of Soft vs Hard Margin SVMs [Figure: hard-margin SVM vs soft-margin SVM decision boundaries in the Var1-Var2 plane, with slack ξi marked]

  23. Soft vs Hard Margin SVM • The soft margin always has a solution (for any C > 0) • The soft margin is more robust to outliers • The hard margin does not require guessing the cost parameter C (it requires no parameters at all)

  24. Support Vector Machines • Three main ideas: • Define what an optimal hyperplane is (in a way that can be identified in a computationally efficient way): maximize the margin • Extend the above definition to non-linearly separable problems: add a penalty term for misclassifications • Map the data to a high-dimensional space where it is easier to classify with linear decision surfaces: reformulate the problem so that the data are mapped implicitly to this space

  25. Support Vector Machines • Three main ideas: • Define what an optimal hyperplane is (in a way that can be identified in a computationally efficient way): maximize the margin • Extend the above definition to non-linearly separable problems: add a penalty term for misclassifications • Map the data to a high-dimensional space where it is easier to classify with linear decision surfaces: reformulate the problem so that the data are mapped implicitly to this space

  26. Disadvantages of Linear Decision Surfaces [Figure: data in the Var1-Var2 plane that no linear decision surface can separate]

  27. Advantages of Non-Linear Surfaces [Figure: data in the Var1-Var2 plane separated by a non-linear decision surface]

  28. Linear Classifiers in High-Dimensional Spaces • Find a function Φ(x) that maps the data to a different space [Figure: the original Var1-Var2 space mapped to the space of Constructed Feature 1 vs Constructed Feature 2]

  29. Mapping Data to a High-Dimensional Space • Find a function Φ(x) to map the data to a different space; the SVM formulation then uses Φ(x) in place of x • The data appear as Φ(x), and the weights w are now weights in the new space • Explicit mapping is expensive if Φ(x) is very high-dimensional • Solving the problem without explicitly mapping the data is desirable

  30. The Dual of the SVM Formulation • Original SVM formulation: n inequality constraints, n positivity constraints, n slack variables ξi • The (Wolfe) dual of this problem: one equality constraint, n positivity constraints, n variables αi (Lagrange multipliers) • The objective function is more complicated • NOTICE: the data only appear as inner products Φ(xi) · Φ(xj) (see below)
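For reference, the standard soft-margin primal and its Wolfe dual (textbook forms, stated here rather than copied from the slides) are:

```latex
% Soft-margin primal and Wolfe dual (standard textbook forms)
\begin{aligned}
\text{Primal:}\quad
&\min_{\mathbf{w},b,\boldsymbol{\xi}}\ \tfrac{1}{2}\lVert\mathbf{w}\rVert^{2}
  + C\sum_{i=1}^{n}\xi_i
 \quad\text{s.t.}\quad y_i\bigl(\mathbf{w}\cdot\Phi(\mathbf{x}_i)+b\bigr)\ge 1-\xi_i,
 \;\; \xi_i\ge 0\\[4pt]
\text{Dual:}\quad
&\max_{\boldsymbol{\alpha}}\ \sum_{i=1}^{n}\alpha_i
  - \tfrac{1}{2}\sum_{i,j}\alpha_i\alpha_j\,y_i y_j\,\Phi(\mathbf{x}_i)\!\cdot\!\Phi(\mathbf{x}_j)
 \quad\text{s.t.}\quad \sum_{i}\alpha_i y_i = 0,\;\; 0\le\alpha_i\le C
\end{aligned}
```

Note that the data enter the dual only through the inner products Φ(xi) · Φ(xj), which is exactly what the kernel trick exploits.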

  31. The Kernel Trick • Φ(xi) · Φ(xj) means: map the data into the new space, then take the inner product of the new vectors • We can find a function such that K(xi, xj) = Φ(xi) · Φ(xj), i.e., the kernel evaluated on the original data equals the inner product of the images of the data • Then we do not need to map the data into the high-dimensional space explicitly to solve the optimization problem (for training) • How do we classify without explicitly mapping the new instances? It turns out the decision function can also be written using only kernel evaluations K(xi, x)

  32. Examples of Kernels • Assume we measure two quantities, e.g., the expression levels of the genes TrkC and SonicHedgehog (SH), and we use a mapping Φ(x) to a higher-dimensional feature space • Consider a kernel function K(x, z) • We can verify that K(x, z) = Φ(x) · Φ(z)
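A numerical illustration of this verification, using the degree-2 polynomial kernel K(x, z) = (x · z)² as one standard instance (the exact mapping and kernel on the original slide are not reproduced here):

```python
# Check that a kernel equals an inner product of explicitly mapped features.
# The degree-2 polynomial kernel is an illustrative choice, not necessarily the
# one used on the slide.
import numpy as np

def phi(x):
    """Explicit map for K(x, z) = (x . z)^2 with 2-D input (x1, x2)."""
    x1, x2 = x
    return np.array([x1 * x1, x2 * x2, np.sqrt(2.0) * x1 * x2])

def k(x, z):
    return np.dot(x, z) ** 2

x = np.array([1.0, 2.0])   # e.g. (TrkC, SH) measurements for one sample
z = np.array([0.5, -1.0])  # and for another
assert np.isclose(k(x, z), np.dot(phi(x), phi(z)))  # the kernel trick holds
```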

  33. Non-linear SVMs: Feature spaces • General idea: the original feature space can always be mapped to some higher-dimensional feature space where the training set is separable: Φ: x→φ(x)

  34. The Mercer Condition • The SVM dual formulation requires calculating K(xi, xj) for each pair of training instances. The array Gij = K(xi, xj) is called the Gram matrix • A feature space Φ(x) exists for a kernel exactly when G is always positive semi-definite (Mercer condition)
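A numerical sketch of checking this on a finite sample: build the Gram matrix for some kernel and test that its eigenvalues are non-negative (a check on one sample, not a proof of the Mercer condition; the RBF kernel and its parameter are illustrative):

```python
# Sketch: empirical positive semi-definiteness check of a Gram matrix.
import numpy as np

def rbf_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel; gamma is an illustrative choice."""
    return np.exp(-gamma * np.sum((x - z) ** 2))

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))                              # sample points
G = np.array([[rbf_kernel(a, b) for b in X] for a in X])  # Gram matrix

eigenvalues = np.linalg.eigvalsh(G)
print(eigenvalues.min() >= -1e-10)   # True: G is (numerically) positive semi-definite
```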

  35. Support Vector Machines • Three main ideas: • Define what an optimal hyperplane is (in a way that can be identified in a computationally efficient way): maximize the margin • Extend the above definition to non-linearly separable problems: add a penalty term for misclassifications • Map the data to a high-dimensional space where it is easier to classify with linear decision surfaces: reformulate the problem so that the data are mapped implicitly to this space

  36. Ising Model and Random Model • We will use the Ising model and the Random model in our analysis. The details of the models can be found in Phys. Rev. C 64, 054904 (2001). • In short, we simulate the Ising model on a two-dimensional lattice (288 × 288 sites) and then associate with each site a "hadron" in the heavy-ion collision. • For the Random model, the idea is the same as for the Ising model, except that there are no clusters as in the Ising model.
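A minimal Metropolis sketch of a 2-D Ising configuration on a small periodic lattice; the lattice size, temperature, number of sweeps, and the mapping of sites to hadrons below are illustrative assumptions, not the settings of the cited analysis:

```python
# Minimal 2-D Ising model (Metropolis) and a cluster-free "Random" counterpart.
# All parameters are illustrative; the original analysis uses a 288 x 288 lattice.
import numpy as np

rng = np.random.default_rng(1)
L, beta, sweeps = 32, 0.44, 200            # lattice size, inverse temperature, sweeps
spins = rng.choice([-1, 1], size=(L, L))   # random initial configuration

for _ in range(sweeps * L * L):
    i, j = rng.integers(0, L, size=2)
    # Sum of the four nearest neighbours with periodic boundary conditions
    nb = (spins[(i + 1) % L, j] + spins[(i - 1) % L, j]
          + spins[i, (j + 1) % L] + spins[i, (j - 1) % L])
    dE = 2.0 * spins[i, j] * nb            # energy cost of flipping spin (i, j)
    if dE <= 0 or rng.random() < np.exp(-beta * dE):
        spins[i, j] *= -1                  # Metropolis accept

# Random model: sites filled independently, so no cluster structure
random_event = rng.choice([-1, 1], size=(L, L))
```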

  37. Typical pictures of the Ising model and the Random model [Figure: two panels, "Ising Model" and "Random"]

  38. To classify the Random and Ising models • Choose a feature: cast each event to a point in a 72 × 72-dimensional space (a sketch of this step follows below)
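A sketch of this step under stated assumptions: each event is flattened into a single feature vector and fed to a linear SVM; the 4 × 4 block averaging used here to go from a 288 × 288 lattice to a 72 × 72 grid is a guess for illustration, not necessarily how the 72 × 72-dimensional space was actually built:

```python
# Sketch: cast each event to a point in a 72 x 72-dimensional space and
# classify Ising vs Random events with a linear SVM (scikit-learn assumed).
import numpy as np
from sklearn.svm import SVC

def event_to_feature(event):
    """Coarse-grain a 288 x 288 event to 72 x 72 by block averaging, then flatten."""
    coarse = np.asarray(event, dtype=float).reshape(72, 4, 72, 4).mean(axis=(1, 3))
    return coarse.ravel()                    # length 72 * 72 = 5184

def train_classifier(ising_events, random_events):
    """ising_events / random_events: lists of 288 x 288 arrays from the two models."""
    X = np.array([event_to_feature(e) for e in list(ising_events) + list(random_events)])
    y = np.array([1] * len(ising_events) + [0] * len(random_events))
    return SVC(kernel="linear").fit(X, y)
```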

  39. Making it more difficult • Rescale the values of the Random model so that the average value of each component of the Random model is the same as in the Ising model

  40. Need to choose new features! • Choose the average value for each event • Choose the second moment of the hadron density (see the sketch below)
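A small sketch of computing these two features per event (the normalization convention for the second moment is an assumption):

```python
# Sketch: the two proposed features for each event, the average value and the
# second moment of the hadron density.
import numpy as np

def event_features(event):
    """Return (mean, second moment) of the hadron density for one event."""
    density = np.asarray(event, dtype=float).ravel()
    mean = density.mean()
    second_moment = np.mean(density ** 2)   # convention assumed; a central moment is another option
    return np.array([mean, second_moment])
```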

  41. If we decrease the number of lattice sites

  42. Decreasing the size of the lattice

  43. Decreasing the size of the lattice

  44. Cluster structure in the event distribution in high-dimensional space

  45. Cluster structure in the event distribution in high-dimensional space

  46. Cluster structure in the event distribution in two-dimensional space

  47. Conclusions • We can classify different events by choosing appropriate features • We classified the Random model and the Ising model successfully • We can, of course, generalize the above approach to multi-class cases
