
CDH Installation and Running Spark

1. CDH Installation

2. Running Spark

Submitting the Spark example program
spark-submit --master yarn --deploy-mode cluster --executor-memory 512M --class org.apache.spark.examples.SparkPi /opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/lib/spark/examples/jars/spark-examples_2.11-2.4.0-cdh6.3.2.jar 10
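Error: Exception in thread "main" org.apache.hadoop.security.AccessControlException: Permission denied: user=root, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x

Solution: the root user has no write permission on /user in HDFS, so switch to the hdfs user before submitting: su -s /bin/bash hdfs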
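As an alternative to running the job as hdfs, a common approach is to give the submitting user its own HDFS home directory, because Spark on YARN stages job files under /user/<user>/.sparkStaging by default. The following is a minimal sketch, not something taken from the original notes. It assumes sudo is available on the host and that the hdfs account is the HDFS superuser; the function name make_hdfs_home is hypothetical.

```shell
#!/usr/bin/env bash

# Create an HDFS home directory for the given user and hand ownership to it.
# Assumption: the local "hdfs" account is the HDFS superuser and sudo is allowed.
make_hdfs_home() {
  local user="$1"
  sudo -u hdfs hdfs dfs -mkdir -p "/user/${user}" &&
    sudo -u hdfs hdfs dfs -chown "${user}" "/user/${user}"
}

# Usage (run on a cluster node):
#   make_hdfs_home root
```

After this, root should be able to submit the job without switching users, provided no other permission restrictions apply on the cluster.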

Error: yarn.SparkRackResolver: Got an error when resolving hostNames. Falling back to /default-rack for all

Solution: submit in cluster mode by adding --deploy-mode=cluster

Error: Exception in thread "main" java.lang.IllegalArgumentException: Required AM memory (1024+384 MB) is above the max threshold (1024 MB) of this cluster! Please check the values of "yarn.scheduler.maximum-allocation-mb" and/or "yarn.nodemanager.resource.memory-mb".

Solution: the ApplicationMaster needs 1024 + 384 = 1408 MB, which exceeds the 1024 MB limit. In CM or in the yarn-site.xml configuration file, set yarn.scheduler.maximum-allocation-mb to 2 GB and yarn.nodemanager.resource.memory-mb to 2 GB, then restart CDH.
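The "1024+384" in the error message can be reproduced with a little arithmetic. The sketch below assumes Spark's documented default for the memory overhead in cluster mode, which is 10% of the driver memory with a floor of 384 MB. The function name required_am_mb is hypothetical and only illustrates the calculation.

```shell
#!/usr/bin/env bash

# Estimate the container size (in MB) that YARN must grant to the ApplicationMaster.
# Assumption: overhead = max(384, 10% of driver memory), Spark's documented default.
required_am_mb() {
  local driver_mb="$1"
  local overhead_mb=$(( driver_mb / 10 ))
  if [ "$overhead_mb" -lt 384 ]; then
    overhead_mb=384
  fi
  echo $(( driver_mb + overhead_mb ))
}

# Usage:
#   required_am_mb 1024   # prints 1408
```

With the default 1 GB driver, the result is 1408 MB, so a 2 GB value (2048 MB) for yarn.scheduler.maximum-allocation-mb leaves enough room.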

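The log loops indefinitely: yarn.Client: Application report for application_1695024027735_0001 (state: ACCEPTED)

Solution: YARN does not have enough free resources. Remove leftover applications: list them with yarn application -list, then kill each one with yarn application -kill <application ID>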
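Listing and killing leftover applications by hand gets tedious when several are queued. The sketch below is one possible way to automate it, assuming the usual yarn application -list output in which each application row begins with its application ID. The function names list_accepted_apps and kill_apps are hypothetical.

```shell
#!/usr/bin/env bash

# Print the IDs of applications that are still waiting in the ACCEPTED state.
# Assumption: each data row of the listing starts with "application_...".
list_accepted_apps() {
  yarn application -list -appStates ACCEPTED 2>/dev/null |
    awk '$1 ~ /^application_/ { print $1 }'
}

# Kill every application ID read from stdin, one per line.
kill_apps() {
  while read -r app_id; do
    yarn application -kill "$app_id" </dev/null
  done
}

# Usage (review the list first, then kill):
#   list_accepted_apps
#   list_accepted_apps | kill_apps
```

Checking the list before piping it into kill_apps avoids killing a job that is merely waiting its turn rather than stuck.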