Starting yarn-session.sh for Flink on YARN fails with the error: The number of requested virtual cores for application master 1 exceeds the maximum number of virtual cores 0 available in the Yarn Cluster.
Versions:
Hadoop: 3.3.4
Flink: 1.17.1
Problem
Starting yarn-session.sh with Flink on YARN produces the following error:
ERROR org.apache.flink.yarn.cli.FlinkYarnSessionCli [] - Error while running the Flink session.
org.apache.flink.client.deployment.ClusterDeploymentException: Couldn't deploy Yarn session cluster
    at org.apache.flink.yarn.YarnClusterDescriptor.deploySessionCluster(YarnClusterDescriptor.java:437) ~[flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:608) ~[flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.lambda$main(FlinkYarnSessionCli.java:869) ~[flink-dist-1.17.1.jar:1.17.1]
    at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_231]
    at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_231]
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878) ~[hadoop-common-3.3.4.jar:?]
    at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41) ~[flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:869) [flink-dist-1.17.1.jar:1.17.1]
Caused by: org.apache.flink.configuration.IllegalConfigurationException: The number of requested virtual cores for application master 1 exceeds the maximum number of virtual cores 0 available in the Yarn Cluster.
    at org.apache.flink.yarn.YarnClusterDescriptor.isReadyForDeployment(YarnClusterDescriptor.java:338) ~[flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.yarn.YarnClusterDescriptor.deployInternal(YarnClusterDescriptor.java:567) ~[flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.yarn.YarnClusterDescriptor.deploySessionCluster(YarnClusterDescriptor.java:430) ~[flink-dist-1.17.1.jar:1.17.1]
    ... 7 more

------------------------------------------------------------
 The program finished with the following exception:

org.apache.flink.client.deployment.ClusterDeploymentException: Couldn't deploy Yarn session cluster
    at org.apache.flink.yarn.YarnClusterDescriptor.deploySessionCluster(YarnClusterDescriptor.java:437)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:608)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.lambda$main(FlinkYarnSessionCli.java:869)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
    at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:869)
Caused by: org.apache.flink.configuration.IllegalConfigurationException: The number of requested virtual cores for application master 1 exceeds the maximum number of virtual cores 0 available in the Yarn Cluster.
    at org.apache.flink.yarn.YarnClusterDescriptor.isReadyForDeployment(YarnClusterDescriptor.java:338)
    at org.apache.flink.yarn.YarnClusterDescriptor.deployInternal(YarnClusterDescriptor.java:567)
    at org.apache.flink.yarn.YarnClusterDescriptor.deploySessionCluster(YarnClusterDescriptor.java:430)
    ... 7 more
Cause
I first set every parameter in yarn-site.xml that seemed relevant and restarted YARN, but yarn-session.sh still failed with the same error:
yarn.containers.vcores = 8
yarn.nodemanager.resource.cpu-vcores = 4
yarn.scheduler.maximum-allocation-vcores = 2
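While looking through the cluster information in the YARN web UI, I noticed an entry under Unhealthy Nodes and opened its health report.
The node's disk utilization had exceeded 90% (YARN's default threshold is 90%), so YARN marked the node as unhealthy. An unhealthy node is effectively unavailable, and because this local setup has only one node, the whole cluster had no usable resources. That is why the error at the top reports a maximum of 0 virtual cores.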
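The chain of reasoning above can be sketched as a small model. This is not Flink or YARN code: the `Node` class and both function names are hypothetical, and the sketch assumes the client only counts nodes in the RUNNING state when it works out the cluster's maximum vcores, which is consistent with the error message but is an assumption here.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in for a YARN node report."""
    name: str
    state: str   # e.g. "RUNNING" or "UNHEALTHY"
    vcores: int  # vcores the NodeManager advertises


def max_vcores_of_running_nodes(nodes):
    # Assumption: only RUNNING nodes count. With none running, the maximum is 0.
    return max((n.vcores for n in nodes if n.state == "RUNNING"), default=0)


def check_am_vcores(requested, nodes):
    """Return a problem description, or None if the request fits."""
    available = max_vcores_of_running_nodes(nodes)
    if requested > available:
        return (f"requested {requested} vcores for the application master, "
                f"but the cluster maximum is {available}")
    return None


# A single-node cluster whose only node is unhealthy, as in this post.
single_node_cluster = [Node("localhost", "UNHEALTHY", 4)]
print(check_am_vcores(1, single_node_cluster))
```

In this model the node still advertises 4 vcores, but because it is not RUNNING it contributes nothing, which would explain why tuning the vcore settings had no effect.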
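Solution
Following the hint in the health report, I added the following property to yarn-site.xml:
yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage = 99
After restarting YARN, the node status returned to normal, and Flink's yarn-session.sh started successfully.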
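To see how close a disk is to the threshold, a minimal sketch like the one below may help. It uses Python's standard `shutil.disk_usage`; the function names are hypothetical, and the percentage it computes may differ slightly from what the NodeManager's own disk checker calculates. The path `/` is only a placeholder: the directories YARN actually checks are the ones configured in `yarn.nodemanager.local-dirs` and `yarn.nodemanager.log-dirs`.

```python
import shutil


def utilization_percent(total_bytes, used_bytes):
    """Used space as a percentage of total space."""
    return 100.0 * used_bytes / total_bytes


def exceeds_threshold(percent, threshold=90.0):
    # 90.0 mirrors the default threshold mentioned in this post.
    return percent > threshold


def report(path, threshold=90.0):
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    percent = utilization_percent(usage.total, usage.used)
    status = "over" if exceeds_threshold(percent, threshold) else "within"
    return f"{path}: {percent:.1f}% used, {status} the {threshold:.0f}% threshold"


# Placeholder path; substitute a directory that the NodeManager uses.
print(report("/"))
```

Raising the threshold to 99 leaves very little headroom, so freeing disk space is worth considering as well.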
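Summary
When yarn-session fails on Flink with a YARN-related error, check the YARN web UI for unhealthy nodes and their health reports, then adjust the configuration based on the specific message and retest until the problem is resolved.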
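The same check can be scripted instead of done in the web UI. The sketch below assumes the ResourceManager REST endpoint `/ws/v1/cluster/nodes` is reachable at `http://localhost:8088` (the usual default web port, but an assumption for any given cluster) and that each node entry carries `id`, `state` and `healthReport` fields. The function names and the sample payload are made up for illustration.

```python
import json
from urllib.request import urlopen


def unhealthy_nodes(payload):
    """Return (id, state, healthReport) for every node that is not RUNNING."""
    # The "nodes" value may be missing or null when no nodes are registered.
    nodes = (payload.get("nodes") or {}).get("node") or []
    return [
        (n.get("id"), n.get("state"), n.get("healthReport", ""))
        for n in nodes
        if n.get("state") != "RUNNING"
    ]


def fetch_nodes(rm_url="http://localhost:8088"):
    """Fetch the node list from the ResourceManager (URL is an assumption)."""
    with urlopen(f"{rm_url}/ws/v1/cluster/nodes", timeout=10) as resp:
        return json.load(resp)


# Made-up sample payload, shaped like the response this sketch expects.
sample = {
    "nodes": {
        "node": [
            {"id": "host-a:8041", "state": "RUNNING", "healthReport": ""},
            {"id": "host-b:8041", "state": "UNHEALTHY",
             "healthReport": "sample health report text"},
        ]
    }
}
for node_id, state, health in unhealthy_nodes(sample):
    print(node_id, state, health)
```

Against a live cluster, `unhealthy_nodes(fetch_nodes())` would replace the sample payload.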