Deriving Concept-Based User Profiles From Search Engine Logs


Abstract
User profiling is a fundamental component of any personalization application. Most existing user profiling strategies are based on objects that users are interested in (i.e., positive preferences), but not on objects that users dislike (i.e., negative preferences). In this paper, we focus on search engine personalization and develop several concept-based user profiling methods that are based on both positive and negative preferences. We evaluate the proposed methods against our previously proposed personalized query clustering method. Experimental results show that profiles that capture and utilize both the user’s positive and negative preferences perform the best. An important result from the experiments is that profiles with negative preferences can increase the separation between similar and dissimilar queries. This separation provides a clear threshold at which an agglomerative clustering algorithm can terminate, and it improves the overall quality of the resulting query clusters.

Existing System:
Existing clickthrough-based user profiling strategies can be categorized into document-based and concept-based approaches. Both assume that user clicks can be used to infer users’ interests, although their inference methods and the outcomes of the inference are different. Document-based profiling methods try to estimate users’ document preferences (i.e., users are interested in some documents more than others). Concept-based profiling methods, on the other hand, aim to derive topics or concepts that users are highly interested in. These two approaches will be reviewed in. While there are document-based methods that consider both users’ positive and negative preferences, to the best of our knowledge, there are no concept-based methods that consider both positive and negative preferences in deriving users’ topical interests.
Problems:
  • Personalized search is not implemented, so relevant (user-specific) results are not displayed.
  • Most existing user profiling strategies only consider documents that users are interested in (i.e., users’ positive preferences) but ignore documents that users dislike (i.e., users’ negative preferences).
  • Short and ambiguous queries are unable to express the user’s precise needs, and the same results are displayed for the same query regardless of the user’s real interest.
Proposed System:
Personalized search is an important research area that aims to resolve the ambiguity of query terms. To increase the relevance of search results, personalized search engines create user profiles to capture the users’ personal preferences and thereby identify the actual goal of the input query. Since users are usually reluctant to provide their preferences explicitly because of the extra manual effort involved, recent research has focused on automatically learning user preferences from users’ search histories or browsed documents and on developing personalized systems based on the learned preferences. A good user profiling strategy is an essential and fundamental component of search engine personalization. We studied various user profiling strategies for search engine personalization and observed the following problem in existing strategies. Most personalization methods focused on the creation of one single profile for a user and applied the same profile to all of the user’s queries. We believe that different queries from a user should be handled differently because a user’s preferences may vary across queries. For example, a user who prefers information about fruit for the query “orange” may prefer information about Apple Computer for the query “apple.” Personalization strategies such as employed a single large user profile for each user in the personalization process.

Advantages:
  1. We extend the query-oriented, concept-based user profiling method proposed in to consider both users’ positive and negative preferences in building user profiles.
  2. We propose six user profiling methods that exploit a user’s positive and negative preferences to produce profiles (SVM-based decision making).
  3. We use a ranking SVM (RSVM) to learn, from concept preferences, weighted concept vectors that represent concept-based user profiles.
  4. We show that profiles that capture both the user’s positive and negative preferences perform best among all of the proposed methods. We also find that the query clusters obtained from our methods are very close to the optimal clusters.

Architecture

HARDWARE & SOFTWARE REQUIREMENTS:
HARDWARE REQUIREMENTS:
·  System            :  Pentium IV 2.4 GHz
·  Hard Disk         :  40 GB
·  Floppy Drive      :  1.44 MB
·  Monitor           :  15" VGA Color
·  Mouse             :  Logitech
·  RAM               :  512 MB
SOFTWARE REQUIREMENTS:
·  Operating System  :  Windows XP Professional
·  Coding Language   :  Java (JDK 1.6.0)
·  Front End         :  Struts Framework
·  Back End          :  Oracle 10g
Modules Description
1. Concept-based user selection
2. User log information
3. Support identification based on the concept
4. Weight generation
5. Precision and recall

Concept-Based User Selection:
We propose concept-based user profiling strategies that are capable of deriving both the users’ positive and negative preferences. All of the user profiling strategies are query-oriented, meaning that a profile is created for each of the user’s queries. The user profiling strategies are evaluated and compared with our previously proposed personalized query clustering method.
User Log Information:
User profiling strategies can be broadly classified into two main approaches: document-based and concept-based. Document-based user profiling methods aim at capturing users’ clicking and browsing behaviors. Users’ document preferences are first extracted from the clickthrough data and then used to learn the user behavior model, which is usually represented as a set of weighted features. Concept-based user profiling methods, on the other hand, aim at capturing users’ conceptual needs. Users’ browsed documents and search histories are automatically mapped into a set of topical categories. User profiles are then created based on the users’ preferences for the extracted topical categories.
Support Identification Based on the Concept:
A ranking support vector machine (RSVM) is used to learn, from concept preferences, weighted concept vectors that represent concept-based user profiles. The weights of the vector elements, which can be positive or negative, represent how interesting (or uninteresting) the concepts are to the user. In, the weights that represent a user’s interests are all positive, meaning that the method can only capture the user’s positive preferences.
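The effect of signed weights can be illustrated with a small sketch. It assumes a profile is stored as a sparse map from concept name to weight and that two profiles are compared with cosine similarity; both choices, along with the class name ConceptProfileSketch and the example concepts and weights, are assumptions made for illustration and are not taken from the paper. The weights are set by hand here rather than learned by RSVM.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: a concept profile as a sparse vector of signed weights. */
final class ConceptProfileSketch {

    /** Euclidean length of a sparse weight vector. */
    static double norm(Map<String, Double> v) {
        double sum = 0.0;
        for (double w : v.values()) {
            sum += w * w;
        }
        return Math.sqrt(sum);
    }

    /** Cosine similarity of two sparse vectors; returns 0.0 if either is all zeros. */
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * other;
            }
        }
        double na = norm(a);
        double nb = norm(b);
        if (na == 0.0 || nb == 0.0) {
            return 0.0;
        }
        return dot / (na * nb);
    }

    public static void main(String[] args) {
        // Hypothetical profiles for the query "apple" from two different users.
        Map<String, Double> fruitUser = new HashMap<String, Double>();
        fruitUser.put("fruit", 1.0);
        fruitUser.put("recipe", 0.5);
        Map<String, Double> techUser = new HashMap<String, Double>();
        techUser.put("computer", 1.0);
        techUser.put("recipe", 0.5);
        System.out.println("positive only:  " + cosine(fruitUser, techUser));

        // Add negative weights for the concepts each user dislikes.
        fruitUser.put("computer", -1.0);
        techUser.put("fruit", -1.0);
        System.out.println("with negatives: " + cosine(fruitUser, techUser));
    }
}
```

With positive weights only, the two profiles still look somewhat similar because they share one concept. Once the disliked concepts carry negative weights, the similarity drops below zero, which is the kind of increased separation between dissimilar queries that the abstract describes.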
Weight Generation:
We evaluate the proposed user profiling strategies and compare them with a baseline proposed in. We show that profiles that capture both the user’s positive and negative preferences perform best among all of the proposed methods. We also find that the query clusters obtained from our methods are very close to the optimal clusters.
Precision and Recall:
The optimal clusters are taken to be the clusters obtained by the best termination strategies for initial clustering and community merging. The optimal clusters are compared to the standard clusters using the standard precision and recall measures:
precision(q) = |Q_relevant ∩ Q_retrieved| / |Q_retrieved|
recall(q) = |Q_relevant ∩ Q_retrieved| / |Q_relevant|
where q is the input query, Q_relevant is the set of queries that exist in the predefined cluster for q, and Q_retrieved is the set of queries generated by the clustering algorithm. The precision and recall from all queries are averaged to plot the precision-recall figures, which compare the effectiveness of the user profiles.
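As a minimal sketch of these measures, the following code computes precision as the share of retrieved queries that are also in the predefined cluster, and recall as the share of the predefined cluster that was retrieved. The class name ClusterPrecisionRecall, the helper names, and the example query strings are hypothetical; returning 0.0 for an empty set is an assumption made here to avoid division by zero.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Illustrative only: set-based precision and recall for one input query. */
final class ClusterPrecisionRecall {

    /** Number of queries present in both sets. */
    static int overlap(Set<String> relevant, Set<String> retrieved) {
        Set<String> common = new HashSet<String>(relevant);
        common.retainAll(retrieved);
        return common.size();
    }

    /** |relevant AND retrieved| / |retrieved|; 0.0 when nothing was retrieved (assumption). */
    static double precision(Set<String> relevant, Set<String> retrieved) {
        if (retrieved.isEmpty()) {
            return 0.0;
        }
        return (double) overlap(relevant, retrieved) / retrieved.size();
    }

    /** |relevant AND retrieved| / |relevant|; 0.0 when the predefined cluster is empty (assumption). */
    static double recall(Set<String> relevant, Set<String> retrieved) {
        if (relevant.isEmpty()) {
            return 0.0;
        }
        return (double) overlap(relevant, retrieved) / relevant.size();
    }

    /** Arithmetic mean, used to average the per-query scores. */
    static double average(double[] values) {
        if (values.length == 0) {
            return 0.0;
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    public static void main(String[] args) {
        // Hypothetical predefined cluster and algorithm output for one query.
        Set<String> relevant = new HashSet<String>(
                Arrays.asList("apple ipod", "macbook", "iphone", "apple mac"));
        Set<String> retrieved = new HashSet<String>(
                Arrays.asList("apple ipod", "macbook", "apple pie"));
        System.out.println("precision = " + precision(relevant, retrieved));
        System.out.println("recall    = " + recall(relevant, retrieved));
    }
}
```

In this example two of the three retrieved queries are relevant, and two of the four relevant queries are retrieved. Per-query values like these would then be passed to average to obtain one point on a precision-recall plot.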
Algorithms:  Personalized Agglomerative Clustering
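
The role of a similarity threshold as a stopping rule can be shown with a generic agglomerative clustering sketch. This is not the personalized algorithm from the paper, whose initial clustering and community merging steps are not described on this page. It is a plain average-linkage procedure over a precomputed query similarity matrix, and the class name ThresholdAgglomerativeSketch, the matrix values, and the threshold of 0.5 are all assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: average-linkage agglomerative clustering that stops at a threshold. */
final class ThresholdAgglomerativeSketch {

    /** Mean pairwise similarity between the members of two clusters. */
    static double linkage(List<Integer> a, List<Integer> b, double[][] sim) {
        double sum = 0.0;
        for (int i : a) {
            for (int j : b) {
                sum += sim[i][j];
            }
        }
        return sum / (a.size() * b.size());
    }

    /** Merges the most similar pair of clusters until no pair reaches the threshold. */
    static List<List<Integer>> cluster(double[][] sim, double threshold) {
        List<List<Integer>> clusters = new ArrayList<List<Integer>>();
        for (int i = 0; i < sim.length; i++) {
            List<Integer> single = new ArrayList<Integer>();
            single.add(i);
            clusters.add(single);
        }
        while (clusters.size() > 1) {
            int bestA = -1;
            int bestB = -1;
            double best = Double.NEGATIVE_INFINITY;
            for (int a = 0; a < clusters.size(); a++) {
                for (int b = a + 1; b < clusters.size(); b++) {
                    double s = linkage(clusters.get(a), clusters.get(b), sim);
                    if (s > best) {
                        best = s;
                        bestA = a;
                        bestB = b;
                    }
                }
            }
            if (best < threshold) {
                break; // the remaining clusters are too dissimilar to merge
            }
            clusters.get(bestA).addAll(clusters.get(bestB));
            clusters.remove(bestB); // removes by index, since bestB is an int
        }
        return clusters;
    }

    public static void main(String[] args) {
        // Hypothetical similarities between four queries, indexed 0 to 3.
        double[][] sim = {
            { 1.0,  0.9, -0.5, -0.5},
            { 0.9,  1.0, -0.5, -0.5},
            {-0.5, -0.5,  1.0,  0.8},
            {-0.5, -0.5,  0.8,  1.0}
        };
        System.out.println(cluster(sim, 0.5));
    }
}
```

When similar pairs score well above the threshold and dissimilar pairs score well below it, as in this matrix, the stopping point is unambiguous. That is the practical benefit the abstract attributes to profiles with negative preferences.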
