Step 5 : Now sort data based on product_price in order. sortedPriceProducts=nonempty_lines.map(lambdaline:(float(line.split(",")[4]),line.split(",")[2])).sortByKey()
for line in sortedPriceProducts.collect(): print(line)
Step 6 : Now sort data based on product_price in descending order. sortedPriceProducts=nonempty_lines.map(lambda line: (float(line.split(",")[4]),line.split(",")[2])).sortByKey(False)
for line in sortedPriceProducts.collect(): print(line)
Step 7 : Get highest price products name. sortedPriceProducts=nonemptyJines.map(lambda line : (float(line.split(",")[4]),line-split(,,,,,)[2]))-sortByKey(False).take(1)
print(sortedPriceProducts)
Step 8 : Now sort data based on product_price as well as product_id in descending order.
#Dont forget to cast string #Tuple as key ((price,id),name)
sortedPriceProducts=nonemptyJines.map(lambda line : ((float(line print(sortedPriceProducts)
Step 9 : Now sort data based on product_price as well as product_id in descending order, using top() function.
There is a parent organization called "ABC Group Inc", which has two child companies named Tech Inc and MPTech.
Both companies employee information is given in two separate text file as below. Please do the following activity for employee details.
Tech Inc.txt
1,Alok,Hyderabad
2,Krish,Hongkong
3,Jyoti,Mumbai
4,Atul,Banglore
5,Ishan,Gurgaon
MPTech.txt
6,John,Newyork
7,alp2004,California
8,tellme,Mumbai
9,Gagan21,Pune
10,Mukesh,Chennai
1. Which command will you use to check all the available command line options on HDFS and How will you get the Help for individual command.
2. Create a new Empty Directory named Employee using Command line. And also create an empty file named in it Techinc.txt
3. Load both companies Employee data in Employee directory (How to override existing file in HDFS).
4. Merge both the Employees data in a Single tile called MergedEmployee.txt, merged tiles should have new line character at the end of each file content.
5. Upload merged file on HDFS and change the file permission on HDFS merged file, so that owner and group member can read and write, other user can read the file.
6. Write a command to export the individual file as well as entire directory from HDFS to local file System.
Answer is in the explanation below.
Reference / correct answer:
See the explanation for Step by Step Solution and configuration.
Solution :
Step 1 : Check All Available command hdfs dfs
Step 2 : Get help on Individual command hdfs dfs -help get
Step 3 : Create a directory in HDFS using named Employee and create a Dummy file in it called e.g. Techinc.txt hdfs dfs -mkdir Employee
Now create an emplty file in Employee directory using Hue.
Step 4 : Create a directory on Local file System and then Create two files, with the given data in problems.
Step 5 : Now we have an existing directory with content in it, now using HDFS command line , overrid this existing Employee directory. While copying these files from local file System to HDFS. cd /home/cloudera/Desktop/ hdfs dfs -put -f Employee
Step 6 : Check All files in directory copied successfully hdfs dfs -Is Employee
Step 7 : Now merge all the files in Employee directory, hdfs dfs -getmerge -nl Employee MergedEmployee.txt
Step 8 : Check the content of the file. cat MergedEmployee.txt
Step 9 : Copy merged file in Employeed directory from local file ssytem to HDFS. hdfs dfs -put MergedEmployee.txt Employee/