Each e-mail is automatically sorted into the most likely category or, failing that, into a miscellaneous category. A category consists of a set of keywords and associated weights that describe it. You can build a hierarchy of categories. Each e-mail is converted to a set of weighted keywords and compared with the keywords for every category. The closest (highest-similarity) category is selected if its similarity exceeds a threshold; otherwise, the e-mail is assigned to the miscellaneous category.
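The selection rule above can be sketched in a few lines. This is a minimal illustration, not the project's code: the page does not name the similarity measure, so cosine similarity is assumed here, and the names cosine, categorize, and the default threshold of 0.2 are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse keyword -> weight dicts (assumed measure)."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def categorize(email_vec, categories, threshold=0.2):
    """Return the closest category name, or "miscellaneous" if none exceeds the threshold.

    categories maps a category name to its keyword -> weight dict.
    The threshold value is an illustrative assumption.
    """
    best_name, best_sim = "miscellaneous", 0.0
    for name, cat_vec in categories.items():
        sim = cosine(email_vec, cat_vec)
        if sim > threshold and sim > best_sim:
            best_name, best_sim = name, sim
    return best_name

categories = {
    "travel": {"flight": 1.0, "hotel": 0.8},
    "billing": {"invoice": 1.0, "payment": 0.9},
}
print(categorize({"invoice": 2.0, "due": 1.0}, categories))
print(categorize({"lunch": 1.0}, categories))
```

An e-mail that shares no keywords with any category has a similarity of zero everywhere and therefore falls through to the miscellaneous category.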
All e-mails in a category can be viewed separately. E-mails can be manually routed to categories to build an accurate representation of a category. The weights of the terms describing a category can also be adjusted. Finally, when a category has been sufficiently trained with a set of e-mails and manual tuning and appears to be accurately represented, it can be made static by stopping training.
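One way to picture training, manual tuning, and freezing is a small category object. This is a sketch under assumptions: the category is represented as a running mean (centroid) of the e-mail vectors routed to it, and the class and method names (Category, train, set_weight, stop_training) are hypothetical rather than taken from MailUtil.

```python
class Category:
    """A category described by weighted keywords, trained from routed e-mails."""

    def __init__(self, name):
        self.name = name
        self.centroid = {}   # keyword -> weight
        self.count = 0       # number of e-mails used for training so far
        self.static = False  # True once training has been stopped

    def train(self, email_vec):
        """Fold one e-mail vector into the running mean; ignored once static."""
        if self.static:
            return False
        n = self.count
        terms = set(self.centroid) | set(email_vec)
        self.centroid = {
            t: (self.centroid.get(t, 0.0) * n + email_vec.get(t, 0.0)) / (n + 1)
            for t in terms
        }
        self.count = n + 1
        return True

    def set_weight(self, term, weight):
        """Manually override the weight of one term."""
        self.centroid[term] = weight

    def stop_training(self):
        """Make the category static so later e-mails no longer change it."""
        self.static = True

billing = Category("billing")
billing.train({"invoice": 2.0})
billing.train({"invoice": 4.0, "payment": 2.0})
billing.set_weight("payment", 5.0)  # manual tuning
billing.stop_training()
print(billing.centroid)
```

After stop_training, further calls to train leave the weights untouched, which mirrors the idea of a category becoming static once it is considered accurate.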
An E-flow function is included to track the flow of e-mail traffic. You can set reminders to send or receive e-mails. If you are expecting an e-mail, an E-flow entry can be used to verify whether it has arrived. Likewise, if you have to send an e-mail, you can generate a task to remind you. The reminder will let you know whether the task is complete.
E-flow can be used for reminders to send greetings or to reply to an e-mail later. The status of a task is either complete or incomplete and is set automatically by comparing the task description with the To address and the contents of the e-mail. If the similarity exceeds a threshold and the date of the e-mail falls within the task's date range, the task is complete.
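The completion check described above might look like the following sketch. The similarity measure (cosine over word counts), the 0.3 threshold, the dictionary layout of a task and an e-mail, and the function names are all assumptions made for illustration.

```python
import math
import re
from collections import Counter
from datetime import date

def tokens(text):
    """Lower-case word counts; a simple stand-in for a weighted keyword vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def similarity(a, b):
    """Cosine similarity between two Counters (assumed measure)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm = math.sqrt(sum(c * c for c in a.values())) * math.sqrt(sum(c * c for c in b.values()))
    return dot / norm if norm else 0.0

def task_complete(task, email, threshold=0.3):
    """A task is complete if the e-mail is similar enough and dated within the task's range."""
    in_range = task["start"] <= email["date"] <= task["end"]
    target = tokens(email["to"] + " " + email["body"])
    return in_range and similarity(tokens(task["description"]), target) > threshold

task = {
    "description": "send budget report to alice@example.com",
    "start": date(2024, 1, 1),
    "end": date(2024, 1, 31),
}
sent = {
    "to": "alice@example.com",
    "body": "Attached is the budget report",
    "date": date(2024, 1, 15),
}
print(task_complete(task, sent))
```

Both conditions must hold: a matching e-mail dated outside the range, or an unrelated e-mail inside it, leaves the task incomplete.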
Modules
MailUtil - This module contains functions to display the e-mail header HTML for CGI scripts, save an e-mail in the database, categorize an e-mail, compute a centroid for an e-mail category, and perform other tasks. The CGI scripts whose names start with the em_ prefix contain examples of the e-mail functions.
Function Calls for save_email
The save_email function accepts a file consisting of many concatenated e-mails separated by e-mail headers. The body and headers of each e-mail are saved in a hash. An e-mail vector is generated using the email_vector function and compared with the centroids of the e-mail categories. Categories whose similarity exceeds a threshold are saved, and the e-mail is assigned to the category with the highest similarity. If the similarity with every category is below the threshold, the e-mail is assigned to the miscellaneous category. Finally, the e-mail network is updated to reflect the new e-mail's From and To addresses.
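The sequence of steps described for save_email can be sketched roughly as follows. This is not the MailUtil implementation: it assumes mbox-style "From " separator lines between messages, plain-text single-part bodies, a word-count vector in place of email_vector, and a caller-supplied categorize function. The names split_messages and save_emails and the store and network structures are hypothetical.

```python
import email
import re
from collections import Counter, defaultdict

def split_messages(raw):
    """Split concatenated e-mails on mbox-style 'From ' separator lines (assumption)."""
    parts = re.split(r"(?m)^From .*\n", raw)
    return [p for p in parts if p.strip()]

def save_emails(raw, categorize, store, network):
    """Parse each e-mail, vectorize it, assign a category, and update the network."""
    for text in split_messages(raw):
        msg = email.message_from_string(text)
        # Only plain single-part bodies are handled in this sketch.
        body = "" if msg.is_multipart() else msg.get_payload()
        record = {
            "from": msg.get("From", ""),
            "to": msg.get("To", ""),
            "subject": msg.get("Subject", ""),
            "body": body,
        }
        # Stand-in for email_vector: word counts over subject and body.
        words = re.findall(r"[a-z0-9]+", (record["subject"] + " " + body).lower())
        record["category"] = categorize(Counter(words))
        store.append(record)
        # The e-mail network counts messages per (sender, recipient) pair.
        network[record["from"]][record["to"]] += 1
    return store

raw = (
    "From alice@example.com Mon Jan  1 00:00:00 2024\n"
    "From: alice@example.com\nTo: bob@example.com\nSubject: Invoice\n\n"
    "Please pay the invoice.\n"
)
store, network = [], defaultdict(Counter)
save_emails(raw, lambda v: "billing" if "invoice" in v else "miscellaneous", store, network)
print(store[0]["category"], dict(network["alice@example.com"]))
```

Passing the categorizer in as a function keeps the parsing and bookkeeping separate from the similarity logic, so either part can be replaced independently.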
Processing a received email
Projects
1. Write modules to import e-mail data from external sources into the e-mail archive: copying e-mails from a mail server, mail client, or web-based e-mail to the archive.
2. Improve the accuracy of categorization: can manually setting the term weights for a category lead to a better representation of the category?
3. Check E-flow: does it track the flow (in and out) of e-mails?