Math, Physics and Information Retrieval (IR)
Author: Dr. Xing M (Sherman) Wang
How to email the author:
Subject line: About the articles on your web site
Email address: swang (at) shermanlab (dot) com, or via arxiv.org if you are a member
1: Dirac Notation, Fock Space and Riemann Metric Tensor in IR Models
HTML;
PDF: Current (08/05/2007);
Archived
Abstract
Using Dirac notation as a powerful tool, we investigate the three classical Information Retrieval (IR)
models and some of their extensions. We show that almost all such models can be described by vectors
in Occupation Number Representations (ONR) of Fock spaces with various specifications on, e.g., occupation number,
inner product or term-term interactions. As an important case study, the basic formulas for
Singular Value Decomposition (SVD) of the Latent Semantic Indexing (LSI) model are manipulated in terms of Dirac notation.
Based on SVD, a Riemannian metric tensor is introduced, which can be used not only to calculate the relevance of
documents to a query, but also to measure the closeness of documents in data clustering.
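As a rough illustration of the occupation-number picture, the sketch below treats each document as a vector of term counts and scores relevance with a normalized inner product, as in the classical vector space model. It is only a minimal sketch: the vocabulary, the sample texts, and the helper names `occupation_numbers` and `cosine` are assumptions made for this example, and it does not implement the SVD-based metric tensor of the paper.

```python
import math
from collections import Counter

def occupation_numbers(text, vocabulary):
    """Count how many times each vocabulary term occurs in the text."""
    counts = Counter(text.lower().split())
    return [counts[term] for term in vocabulary]

def cosine(u, v):
    """Normalized inner product <u|v> / (|u| |v|); 0.0 if either vector is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical vocabulary and documents, chosen only for illustration.
vocabulary = ["dirac", "notation", "markov", "chain"]
doc_a = occupation_numbers("dirac notation dirac", vocabulary)
doc_b = occupation_numbers("markov chain", vocabulary)
query = occupation_numbers("dirac", vocabulary)

relevance_a = cosine(query, doc_a)
relevance_b = cosine(query, doc_b)
```

Here `doc_a` shares a term with the query and gets a positive score, while `doc_b` shares none and scores zero.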
1: Probability Bracket Notation, Probability Vectors, Markov Chains and Stochastic Processes
PDF: Current (07/17/2007);
Archived
Abstract
Dirac notation has been widely used for vectors in the Hilbert spaces of quantum theories. It has now also been
introduced to Information Retrieval. In this paper, we propose a new set of symbols, the Probability Bracket
Notation (PBN), for probability theories. We define new symbols such as probability bra (p-bra), p-ket, p-bracket,
sample base, unit operator, state ket and more as counterparts of those in Dirac notation, which we refer to as Vector
Bracket Notation (VBN). By applying PBN to represent fundamental definitions and theorems for discrete and continuous
random variables, we show that PBN could play the same role in probability sample space as Dirac notation does in Hilbert space.
We also find that there is a close relation between our probability state kets and probability vectors in Markov chains,
which are involved in data clustering methods such as Diffusion Maps. We summarize the similarities and differences between PBN
and VBN in the two tables of Appendix A.
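For readers unfamiliar with probability vectors in Markov chains, the following minimal sketch evolves a probability row vector under a transition matrix in plain Python. The two-state matrix `T`, the starting vector, and the helper name `step` are hypothetical and chosen only for this example; the sketch does not use the PBN symbols themselves.

```python
def step(p, T):
    """One step of a Markov chain: p'[j] = sum_i p[i] * T[i][j]."""
    n = len(T)
    return [sum(p[i] * T[i][j] for i in range(n)) for j in range(n)]

# Hypothetical 2-state transition matrix; each row sums to 1.
T = [[0.9, 0.1],
     [0.5, 0.5]]

# Start with all probability in state 0 and iterate.
p = [1.0, 0.0]
for _ in range(50):
    p = step(p, T)
```

The components of `p` remain non-negative and sum to 1 at every step, and for this particular matrix they approach the stationary vector (5/6, 1/6).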
2: Induced Hilbert Space, Markov Chain, Diffusion Map and Fock Space in Thermophysics
PDF: Current (04/08/2007);
Archived
Abstract
In this article, we continue to explore the Probability Bracket Notation (PBN) proposed in our previous article.
Using both Dirac vector bracket notation (VBN) and PBN, we define induced Hilbert space and induced sample space,
and propose that there exists an equivalence relation between a Hilbert space and a probability sample space constructed
from the same base observable(s). Then we investigate Markov transition matrices and their eigenvectors to make diffusion
maps with two examples: a simple graph theory example, which serves as a prototype of a bidirectional transition operator,
and a famous text document example from the IR literature, which serves as a tutorial on diffusion maps in text document space.
We notice that, in both examples, the sample space of the Markov chain and the Hilbert space spanned by the
eigenvectors of the transition matrix are not equivalent. Finally, we apply our PBN and equivalence proposal
to Thermophysics by associating phase space with the Hilbert space or Fock space of many-particle systems.
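The first ingredient of a diffusion map on a graph is the Markov transition matrix of a random walk. The sketch below builds that matrix by row-normalizing an adjacency matrix and checks that the degree-proportional vector is left unchanged by one step. The three-node path graph and the helper names are assumptions made for this example, not the graph used in the article, and the eigenvector embedding that completes a diffusion map is not shown.

```python
def transition_matrix(adjacency):
    """Row-normalize an adjacency matrix: T[i][j] = A[i][j] / degree(i).

    Assumes every node has at least one edge, so no degree is zero.
    """
    rows = []
    for row in adjacency:
        degree = sum(row)
        rows.append([weight / degree for weight in row])
    return rows

def step(p, T):
    """Apply one step of the random walk to the probability row vector p."""
    n = len(T)
    return [sum(p[i] * T[i][j] for i in range(n)) for j in range(n)]

# Hypothetical undirected path graph: 0 - 1 - 2.
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
T = transition_matrix(A)

# For an undirected graph, the vector proportional to node degree is stationary.
degrees = [sum(row) for row in A]
total = sum(degrees)
pi = [d / total for d in degrees]
pi_next = step(pi, T)
```

Because the graph is undirected, an edge can be traversed in either direction, which is one simple way to picture a bidirectional transition operator.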