Big Data - Apache Pig

Lesson - #468 Apache Pig FILTER Operator

Pig lets you remove unwanted records based on a condition. The FILTER operator works like the WHERE clause in SQL: it selects the required tuples from a relation and discards every tuple for which the condition does not hold.

The syntax of FILTER operator is shown below:

 <new relation> = FILTER <relation> BY <condition>;
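
To make the <condition> part more concrete, here is a minimal sketch of a few condition forms. The relation name students, its fields (id, name, city) and the load path are hypothetical and only serve as illustration; the comparison operators, AND, OR, NOT and matches are standard Pig Latin.

```pig
-- Hypothetical input: a space-delimited file with three fields per line.
students = LOAD '/tmp/students.txt' USING PigStorage(' ')
    AS (id:int, name:chararray, city:chararray);

-- Combine two comparisons with OR.
pune_or_goa = FILTER students BY city == 'pune' OR city == 'goa';

-- Negate a condition with NOT.
not_delhi = FILTER students BY NOT (city == 'delhi');

-- Mix a numeric comparison with a regular-expression match.
-- matches uses Java regex syntax and must match the whole string.
late_p_cities = FILTER students BY id > 2 AND city matches 'p.*';
```

Each statement produces a new relation and leaves students unchanged, so several filters can be derived from the same input.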


Assume that we have a file named student_details.txt in the HDFS directory /pig_data/ as shown below.

1 aaa 74385738 delhi
2 bbb 76349948 mumbai
3 ddd 87493589 pune
4 ggg 74824727 goa

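Using the FILTER Operator

The following statement keeps only the records of students whose city is pune. It assumes that student_details.txt has already been loaded into a relation named student_details with a field named city.

filter_data = FILTER student_details BY city == 'pune';

Only one record satisfies the condition:

3 ddd 87493589 pune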
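
For completeness, the whole flow might look like the sketch below. The lesson does not show the LOAD statement, so the space delimiter and the field names id, name and phone are assumptions inferred from the sample rows; only city is confirmed by the FILTER statement above.

```pig
-- Load the file from HDFS. The delimiter and the field names other than
-- city are assumptions based on the sample data.
student_details = LOAD '/pig_data/student_details.txt'
    USING PigStorage(' ')
    AS (id:int, name:chararray, phone:chararray, city:chararray);

-- Keep only the tuples whose city field equals 'pune'.
filter_data = FILTER student_details BY city == 'pune';

-- Print the filtered relation to the console.
DUMP filter_data;
```

DUMP prints each result as a tuple in parentheses, so the matching record would appear in tuple form rather than as a space-separated line. To write the result back to HDFS instead, a STORE statement can be used in place of DUMP.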