www.Inmagic.com    Inmagic Forums    Inmagic Forums  Hop To Forum Categories  Scripting    Regular Expression to extract quotes
I have a form that I'm using to extract quotes from news articles. The quotes are dumped to another database. I have successfully got the script working, but I'm having some trouble with the regular expression I'm using. Right now the regular expression gives me the quote, i.e. the text between two " ", but what I'd like is the full paragraph the quote is in instead. Any ideas on how to change my regex so it gives me the paragraph? Likely it has to do with newline characters \n but I'm not all that great with regex. Here's the script as I have it:

function btnExtractQuotes_onClick() {

    var pattern = /"([^"]*)"/g;
    var text = Form.boxes("boxBody").content;
    var headline = Form.boxes("boxHeadline").content;
    var publication = Form.boxes("boxPublication").content;
    var pubdate = Form.boxes("boxPubDate").content;
    var crs = Application.activeTextbase.currentRecordset;

    var ID = Form.boxes("boxID").content;

    Application.message(ID);

    var result;

    while ((result = pattern.exec(text)) != null) {

        var newRecords = Application.newRecordset("Quotes", "L:\\DBTextWorks\\Quotes\\", "");
        if (newRecords == null) {
            Application.message("Failed to open the Quotes textbase.");
        } else {
            newRecords.Open("");
            newRecords.AddNew();
            newRecords.Fields.Item("QUOTE").Value = result[0];
            newRecords.Fields.Item("INDEX").Value = result.index;
            newRecords.Fields.Item("ID_SOURCE").Value = ID;
            newRecords.Fields.Item("PUBLICATION").Value = publication;
            newRecords.Fields.Item("HEADLINE").Value = headline;
            newRecords.Fields.Item("PUBDATE").Value = pubdate;
            newRecords.Update();
            newRecords.Close();
        }
    }
    Application.message("Quoted successfully.");
}
 
Posts: 47 | Registered: Tue July 15 2003
Yeah, newlines.

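Instead of /"([^"]*)"/g try /^.+$/gm. The m flag matters in a script: without it, ^ only matches at the very start of the text, so you would get just the first paragraph.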

I tested on paragraphs in my text editor and it seemed to work well. The "." character matches everything except newlines, so each match stops at the end of its paragraph.

You can test at http://www.rexv.org/.
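
If only the paragraphs that actually contain a quote are wanted, the two patterns can be combined. The following is a rough sketch rather than a tested DB/TextWorks script: it assumes each paragraph sits on a single line, that quotes use straight " characters, and that a quote does not span paragraphs. The function name extractQuoteParagraphs and the sample text are made up for illustration.

```javascript
// Hypothetical helper: return every paragraph (line) that contains
// at least one double-quoted passage, with its offset in the text.
function extractQuoteParagraphs(text) {
    // The m flag makes ^ and $ match at each line break.
    // .* never crosses a line break, so a match is one whole paragraph.
    // [^"\r\n]* keeps the quoted part on the same line.
    var pattern = /^.*"[^"\r\n]*".*$/gm;
    var paragraphs = [];
    var result;
    while ((result = pattern.exec(text)) !== null) {
        paragraphs.push({ paragraph: result[0], index: result.index });
    }
    return paragraphs;
}

// Made-up sample with Windows-style line breaks.
var sample = 'The mayor spoke on Tuesday.\r\n' +
             'He said "we will rebuild" and left.\r\n' +
             'No one else commented.';
var found = extractQuoteParagraphs(sample);
```

In the original script, swapping this pattern in for /"([^"]*)"/g should leave the loop unchanged: result[0] would then hold the paragraph and result.index its position. A paragraph with several quotes is returned once, not once per quote.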


Peter Tyrrell, MLIS
Senior Consultant
Andornot Consulting Inc.
http://www.andornot.com/about/developerblog
 
Posts: 179 | Location: Vancouver, BC, Canada | Registered: Thu September 20 2001