Philippine Computing Journal.

Material type: Text
Series: Philippine Computing Journal, Vol. 14, No. 1, August 2019
Publication details: Philippines : Computing Society of the Philippines, 2019
Description: 43 pages : illustrations ; 28 cm
ISSN: 1908-1995
Contents:
Hate Speech in Philippine Election-Related Tweets: Automatic Detection and Classification Using Natural Language -- Learning about Good Healthcare Practices through Interactive Storytelling with Virtual Peers -- Extracting Events from Fairy Tales for Story Summarization -- Measuring Transcript Relevance and Certainty through Sentence Classification and Semantic Similarity Analysis.
Summary: [Article Title: Hate Speech in Philippine Election-Related Tweets: Automatic Detection and Classification Using Natural Language / Niel Cabasag, Vicente Raphael Chan, Sean Lim, Mark Gonzales, and Charibeth Cheng, p. 1-15] Abstract: Social networking sites have opened avenues for the expression of disparaging and antagonistic sentiments, proliferating hate speech. While technologies have been devised to address this problem, systems contextualized in the Philippine cyberspace are essential since hate speech is deeply tied to the context of a locale. This research sought to address this need by developing a model capable of automating hate speech detection. Tweets posted during the 2016 Philippine electoral campaign were labeled as either hate- or non-hate-containing and annotated with the target(s) of hate. Simple language-independent features, namely term frequency-inverse document frequency (TF-IDF), term occurrence (TO), and their combination, were extracted. For binary classification, logistic regression using TF-IDF+TO with hashtag segmentation performed best (F1 = 77.47%), outperforming the keyword-matching rule-based classifier by around 6%. The feedforward neural network did not outperform the best logistic regression model but scored competitively and used fewer features. For multilabel classification, the perceptron using TF-IDF+TO with hashtag segmentation performed best (micro-F1 = 67.80%, macro-F1 = 61.86%), outperforming the rule-based classifier by 15.71% and 7.25% in macro- and micro-F1, respectively.
The main contribution of this paper is a comparative investigation of different classifiers using simple language-independent features for detecting and classifying political hate speech from the Philippines.

[Article Title: Learning about Good Healthcare Practices through Interactive Storytelling with Virtual Peers / Raisa Lee, Janine Regala, Angeline Tan, Janine Tan, and Ethel Ong, p. 15-24] Abstract: Storytelling can be used to raise awareness of healthcare practices among young children. Combined with interactive environments, children can be given a simulated virtual world in which to learn about the effects of practicing good healthcare habits with the guidance of virtual peers. In this paper, we present Sarah, a virtual peer that shares interactive healthcare stories centering on events that may lead to symptoms of and recovery from common childhood ailments. Armed with a collection of domain knowledge about healthcare facts, the virtual peer utilizes story planning and dialogue generation strategies to facilitate a text-based storytelling session with children aged 7 to 10 years old. Children participate in the storytelling by using free-form input text to respond to questions posed by Sarah. Results from end-user validation showed that children prefer to interact with virtual peers that exude a friendlier persona and that can generate relevant responses aligned to their story text.

[Article Title: Extracting Events from Fairy Tales for Story Summarization / Bianca Trish Adolfo and Ethel Ong, p. 25-33] Abstract: Automatic text summarization is mostly used to provide quick access to relevant information from a huge volume of documents and news, especially from online document search facilities. Summarizing fictional stories, however, may pose some challenges for the machine since important information can appear in unexpected places in the text.
A prerequisite for generating story summaries is a computational model that captures the events needed to recreate the story while ignoring irrelevant details, without losing the central idea of the story writer. In this paper, we describe our approach of using extractive summarization techniques to identify relevant events from a corpus of five (5) fairy tales. Comparing the events found in the computer-generated summary against human-annotated events as the reference text showed that the event extraction algorithm has a precision of 62.33% (the proportion of extracted events that were relevant), a recall of 42.25% (the percentage of total relevant events that were retrieved), and an F-measure of 50.36%. Problems with varying sentence structures led to incorrect and missing extraction instances for different event details. The extraction algorithm also encountered difficulty when dealing with clauses.

[Article Title: Measuring Transcript Relevance and Certainty through Sentence Classification and Semantic Similarity Analysis / Cyrez Ronquilo and Reginald Recario, p. 34-43] Abstract: This paper presents a study that attempts to measure how factual a given statement is and how relevant its claims are. The study is divided into two major parts: sentence classification and semantic similarity analysis. For sentence classification, supervised classifiers were trained: a Support Vector Machine (SVM) with a Radial Basis Function (RBF) kernel, Logistic Regression (LR), and an Artificial Neural Network (ANN). The models were tuned to create different setups and find out which setup produces the best results. The dataset used for the models was 33 US debate transcripts. After preprocessing, 22,611 sentences remained, and 63 features were extracted from the sentences. The Logistic Regression model was determined to be the most reliable model.
The Support Vector Machine scored the highest training accuracy; however, the Logistic Regression model was the most balanced based on five different evaluation metrics. For semantic similarity analysis, news articles were extracted from 16 different satiric and reliable websites to assess an input statement's certainty. A triplet extraction approach was used to compare the input sentence against different articles to provide a corresponding similarity score and to classify the statement as reliable, satiric, or unverified. Using a set of test data, the semantic similarity analysis scored 80% accuracy.
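The TF-IDF+TO feature setup described in the hate-speech abstract can be sketched with a standard machine-learning pipeline. This is a hypothetical illustration only: the paper's actual preprocessing, hashtag segmentation, corpus, and hyperparameters are not reproduced here, and the toy tweets and labels below are invented for demonstration.

```python
# Sketch: TF-IDF weights concatenated with binary term occurrence (TO),
# fed to a logistic regression classifier, using scikit-learn.
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import FeatureUnion, Pipeline

pipeline = Pipeline([
    ("features", FeatureUnion([
        ("tfidf", TfidfVectorizer()),          # TF-IDF term weights
        ("to", CountVectorizer(binary=True)),  # binary term occurrence
    ])),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Toy data: 1 = hate-containing, 0 = non-hate (labels are illustrative)
tweets = [
    "candidate X is a disgrace and a fraud",
    "proud to volunteer for candidate Y today",
    "supporters of X are all idiots",
    "great turnout at the rally this morning",
]
labels = [1, 0, 1, 0]
pipeline.fit(tweets, labels)
predictions = pipeline.predict(["what a fraud candidate Y is"])
```

`FeatureUnion` horizontally stacks the two sparse feature matrices, so the classifier sees both the weighted and the binary view of each term, mirroring the "TF-IDF+TO" combination named in the abstract.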
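The precision, recall, and F-measure reported in the fairy-tale summarization abstract are related by the standard harmonic-mean formula; assuming the usual F1 definition, a quick check confirms the three reported figures are mutually consistent.

```python
# F-measure as the harmonic mean of the reported precision and recall.
precision = 0.6233  # 62.33% of extracted events were relevant
recall = 0.4225     # 42.25% of relevant events were retrieved
f_measure = 2 * precision * recall / (precision + recall)
print(f"{f_measure:.2%}")  # 50.36%, matching the abstract
```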
Item type: Serials
Holdings
Item type: Serials
Library: National University - Manila LRC - Main
Collection: Periodicals, Gen. Ed. - CCIT
Call number: Philippine Computing Journal, Vol. 14, No. 1, August 2019
Copy number: c.1
Status: Available
Barcode: PER000000937
Browsing LRC - Main shelves, Shelving location: Periodicals, Collection: Gen. Ed. - CCIT:
  • Philippine Computing Journal, Volume 8, Issue 2, December 2013 c.2
  • Philippine Computing Journal, Volume 8, Issue 2, December 2013 c.3
  • Philippine Computing Journal, Volume 8, Issue 2, December 2013 c.4
  • Philippine Computing Journal, Vol. 14, No. 1, August 2019
  • Philippine Computing Journal, Vol. 14, No. 2, December 2019
  • Philippine Computing Journal, Volume 9, Issue 1, August 2014 c.1
  • Philippine Computing Journal, Vol. 11, No. 2, December 2016

Includes index and bibliographical references.


