03/10/2010 Wednesday

eEducation, eBusiness & eArts

Computer Science

Math, Physics and Information Retrieval (IR)


Author: Dr. Xing M (Sherman) Wang

How to email the author:
Subject line: About the articles on your web site
Email address: swang (at) shermanlab (dot) com,
or via arxiv.org if you are a member

1: Dirac Notation, Fock Space and Riemann Metric Tensor in IR Models

HTML;   PDF: Current (08/05/2007);   Archived

Abstract

Using Dirac notation as a powerful tool, we investigate the three classical Information Retrieval (IR) models and some of their extensions. We show that almost all such models can be described by vectors in Occupation Number Representations (ONR) of Fock spaces with various specifications on, e.g., occupation number, inner product or term-term interactions. As an important case study, the basic formulas for Singular Value Decomposition (SVD) in the Latent Semantic Indexing (LSI) model are expressed in Dirac notation. Based on SVD, a Riemannian metric tensor is introduced, which not only can be used to calculate the relevance of documents to a query, but may also be used to measure the closeness of documents in data clustering.
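For readers unfamiliar with LSI, the following is a minimal sketch of the standard SVD-based retrieval step that the abstract refers to. It is not taken from the article: the function name `lsi_scores`, the toy term-document matrix, and the choice of cosine similarity in the rank-k latent space are assumptions made for illustration, and the article's Riemannian metric tensor is not reproduced here.

```python
import numpy as np

def lsi_scores(term_doc, query, k):
    """Cosine similarity between a query and each document in a rank-k latent space.

    term_doc: (n_terms, n_docs) term-document matrix (hypothetical toy data below).
    query:    (n_terms,) query vector in the same term space.
    k:        number of singular values to keep.
    """
    # Thin SVD: term_doc = U @ diag(s) @ Vt
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    U_k, s_k, Vt_k = U[:, :k], s[:k], Vt[:k, :]

    # Documents projected onto the first k left singular vectors:
    # U_k^T @ term_doc = diag(s_k) @ Vt_k, one row per document after transposing.
    docs_k = (s_k[:, None] * Vt_k).T

    # The query is projected onto the same basis (one common convention).
    q_k = U_k.T @ query

    # Cosine similarity; guard against zero-length vectors.
    norms = np.linalg.norm(docs_k, axis=1) * np.linalg.norm(q_k)
    return docs_k @ q_k / np.where(norms == 0, 1.0, norms)

# Toy example: documents 0 and 1 share terms 0-1, document 2 uses terms 2-3.
A = np.array([[1.0, 1.0, 0.0],
              [1.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 1.0]])
q = np.array([1.0, 0.0, 0.0, 0.0])
print(lsi_scores(A, q, k=2))
```

In this toy setup the query mentions only term 0, yet documents 0 and 1 score equally because they occupy the same direction in the latent space, while document 2 is orthogonal to the query.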


1: Probability Bracket Notation, Probability Vectors, Markov Chains and Stochastic Processes

PDF: Current (07/17/2007);   Archived

Abstract

Dirac notation has been widely used for vectors in Hilbert spaces of Quantum Theories. It has now also been introduced to Information Retrieval. In this paper, we propose a new set of symbols, the Probability Bracket Notation (PBN), for probability theories. We define new symbols like probability bra (p-bra), p-ket, p-bracket, sample base, unit operator, state ket and more as counterparts of those in Dirac notation, which we refer to as Vector Bracket Notation (VBN). By applying PBN to represent fundamental definitions and theorems for discrete and continuous random variables, we show that PBN could play the same role in probability sample space as Dirac notation does in Hilbert space. We also find that there is a close relation between our probability state kets and probability vectors in Markov chains, which are involved in data clustering methods like Diffusion Maps. We summarize the similarities and differences between PBN and VBN in the two tables of Appendix A.


2: Induced Hilbert Space, Markov Chain, Diffusion Map and Fock Space in Thermophysics

PDF: Current (04/08/2007);   Archived

Abstract

In this article, we continue to explore Probability Bracket Notation (PBN), proposed in our previous article. Using both Dirac vector bracket notation (VBN) and PBN, we define induced Hilbert space and induced sample space, and propose that there exists an equivalence relation between a Hilbert space and a probability sample space constructed from the same base observable(s). Then we investigate Markov transition matrices and their eigenvectors to make diffusion maps with two examples: a simple graph theory example, to serve as a prototype of a bidirectional transition operator; and a famous text document example from the IR literature, to serve as a tutorial on diffusion maps in text document space. We notice that, in both examples, the sample space of the Markov chain and the Hilbert space spanned by the eigenvectors of the transition matrix are not equivalent. Finally, we apply our PBN and equivalence proposal to Thermophysics by associating phase space with Hilbert space or Fock space of many-particle systems.
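As a rough illustration of how a Markov transition matrix and its eigenvectors yield a diffusion map, here is a minimal sketch of the textbook construction on a small undirected graph. It is not the article's own example: the function name `diffusion_map`, the three-node path graph, and the parameters `n_components` and `t` are assumptions chosen for illustration.

```python
import numpy as np

def diffusion_map(W, n_components=2, t=1):
    """Diffusion-map coordinates from a symmetric affinity matrix W.

    Assumes every node has at least one edge, so all row sums are positive.
    Returns the transition matrix P, its eigenvalues in descending order,
    and the diffusion coordinates at diffusion time t.
    """
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)                      # node degrees
    P = W / d[:, None]                     # row-stochastic Markov transition matrix

    # P is similar to the symmetric matrix S = D^(-1/2) W D^(-1/2),
    # so its eigenvalues are real and can be computed with eigh.
    d_inv_sqrt = 1.0 / np.sqrt(d)
    S = d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(S)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Right eigenvectors of P; the first one is constant (eigenvalue 1) and is skipped.
    psi = eigvecs * d_inv_sqrt[:, None]
    idx = slice(1, n_components + 1)
    coords = psi[:, idx] * eigvals[idx] ** t
    return P, eigvals, coords

# Toy example: a path graph 0 - 1 - 2 (hypothetical data).
W = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
P, eigvals, coords = diffusion_map(W, n_components=2, t=1)
print(P)
print(coords)
```

Working through the symmetric matrix `S` instead of `P` directly is a common design choice: it keeps the eigenvalues real and lets a symmetric eigensolver be used.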



More to come. Please visit us again!

Copyright © 2002-2008, Sherman Visual Lab