============================= test session starts ==============================
platform linux -- Python 3.9.21, pytest-6.2.5, py-1.11.0, pluggy-0.13.1
rootdir: /home/jenkins/mindspore/testcases/testcases/tests/st/auto_parallel, configfile: ../../../../../../sault/virtual_test/virtualenv_002/sault/config/pytest.ini
plugins: forked-1.6.0, hydra-core-1.3.2, xdist-1.32.0, anyio-4.9.0
collected 1 item

test_custom_op_parallel.py 
/home/jenkins/.local/lib/python3.9/site-packages/numpy/core/getlimits.py:549: UserWarning: The value of the smallest subnormal for type is zero.
  setattr(self, word, getattr(machar, word).flat[0])
/home/jenkins/.local/lib/python3.9/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for type is zero.
  return self._float_to_str(self.smallest_subnormal)
/home/jenkins/.local/lib/python3.9/site-packages/numpy/core/getlimits.py:549: UserWarning: The value of the smallest subnormal for type is zero.
  setattr(self, word, getattr(machar, word).flat[0])
/home/jenkins/.local/lib/python3.9/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for type is zero.
  return self._float_to_str(self.smallest_subnormal)
Start worker process with rank id:0, log file:./custom_op_parallel_log/worker_0.log. Environment variable [RANK_ID=0] is exported.
Start worker process with rank id:1, log file:./custom_op_parallel_log/worker_1.log. Environment variable [RANK_ID=1] is exported.
Start worker process with rank id:2, log file:./custom_op_parallel_log/worker_2.log. Environment variable [RANK_ID=2] is exported.
Start worker process with rank id:3, log file:./custom_op_parallel_log/worker_3.log. Environment variable [RANK_ID=3] is exported.
Start worker process with rank id:4, log file:./custom_op_parallel_log/worker_4.log. Environment variable [RANK_ID=4] is exported.
Start worker process with rank id:5, log file:./custom_op_parallel_log/worker_5.log. Environment variable [RANK_ID=5] is exported.
Start worker process with rank id:6, log file:./custom_op_parallel_log/worker_6.log. Environment variable [RANK_ID=6] is exported.
Start worker process with rank id:7, log file:./custom_op_parallel_log/worker_7.log. Environment variable [RANK_ID=7] is exported.
[WARNING] ME(1446132:281472914222784,MainProcess):2025-07-15-13:57:06.210.260 [mindspore/parallel/cluster/process_entity/_api.py:267] Distributed job is spawned. Waiting all processes to exit...
[WARNING] DISTRIBUTED(1446209,ffffbbfbeec0,python):2025-07-15-13:57:11.586.157 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:485] Connect] Connection 21 source: 127.0.0.1:51446, destination: 127.0.0.1:10801
[WARNING] DISTRIBUTED(1446209,ffff4fffefa0,python):2025-07-15-13:57:11.586.163 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:79] ConnectedEventHandler] Connection from 127.0.0.1:51446 to 127.0.0.1:10801 is successfully created. System errno: Success
[WARNING] DISTRIBUTED(1446209,ffffbbfbeec0,python):2025-07-15-13:57:11.586.277 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:494] Connect] Waiting for the state of the connection to 127.0.0.1:10801 to be connected...Retry number: 1
[WARNING] DISTRIBUTED(1446221,ffff295fefa0,python):2025-07-15-13:57:11.671.324 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:79] ConnectedEventHandler] Connection from 127.0.0.1:51452 to 127.0.0.1:10801 is successfully created.
System errno: Success [WARNING] DISTRIBUTED(1446221,ffff910eeec0,python):2025-07-15-13:57:11.671.327 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:485] Connect] Connection 21 source: 127.0.0.1:51452, destination: 127.0.0.1:10801 [WARNING] DISTRIBUTED(1446221,ffff910eeec0,python):2025-07-15-13:57:11.671.578 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:485] Connect] Connection 22 source: 127.0.0.1:51454, destination: 127.0.0.1:10801 [WARNING] DISTRIBUTED(1446221,ffff2a61efa0,python):2025-07-15-13:57:11.671.610 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:79] ConnectedEventHandler] Connection from 127.0.0.1:51454 to 127.0.0.1:10801 is successfully created. System errno: Success [WARNING] DISTRIBUTED(1446221,ffff910eeec0,python):2025-07-15-13:57:11.671.624 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:494] Connect] Waiting for the state of the connection to 127.0.0.1:10801 to be connected...Retry number: 1 [WARNING] DISTRIBUTED(1446213,ffff93b8eec0,python):2025-07-15-13:57:11.785.615 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:485] Connect] Connection 21 source: 127.0.0.1:51468, destination: 127.0.0.1:10801 [WARNING] DISTRIBUTED(1446213,ffff27ffefa0,python):2025-07-15-13:57:11.785.615 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:79] ConnectedEventHandler] Connection from 127.0.0.1:51468 to 127.0.0.1:10801 is successfully created. System errno: Success [WARNING] DISTRIBUTED(1446213,ffff93b8eec0,python):2025-07-15-13:57:11.785.728 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:494] Connect] Waiting for the state of the connection to 127.0.0.1:10801 to be connected...Retry number: 1 [WARNING] DISTRIBUTED(1446225,ffff271cefa0,python):2025-07-15-13:57:11.831.296 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:79] ConnectedEventHandler] Connection from 127.0.0.1:51476 to 127.0.0.1:10801 is successfully created. System errno: Success [WARNING] DISTRIBUTED(1446225,ffff8ecbeec0,python):2025-07-15-13:57:11.831.296 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:485] Connect] Connection 21 source: 127.0.0.1:51476, destination: 127.0.0.1:10801 [WARNING] DISTRIBUTED(1446225,ffff8ecbeec0,python):2025-07-15-13:57:11.831.548 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:485] Connect] Connection 22 source: 127.0.0.1:51486, destination: 127.0.0.1:10801 [WARNING] DISTRIBUTED(1446225,ffff281eefa0,python):2025-07-15-13:57:11.831.582 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:79] ConnectedEventHandler] Connection from 127.0.0.1:51486 to 127.0.0.1:10801 is successfully created. System errno: Success [WARNING] DISTRIBUTED(1446225,ffff8ecbeec0,python):2025-07-15-13:57:11.831.594 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:494] Connect] Waiting for the state of the connection to 127.0.0.1:10801 to be connected...Retry number: 1 [WARNING] DISTRIBUTED(1446217,ffffb3e7eec0,python):2025-07-15-13:57:11.836.041 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:485] Connect] Connection 21 source: 127.0.0.1:51492, destination: 127.0.0.1:10801 [WARNING] DISTRIBUTED(1446217,ffff47ffefa0,python):2025-07-15-13:57:11.836.046 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:79] ConnectedEventHandler] Connection from 127.0.0.1:51492 to 127.0.0.1:10801 is successfully created. 
System errno: Success [WARNING] DISTRIBUTED(1446217,ffffb3e7eec0,python):2025-07-15-13:57:11.836.166 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:494] Connect] Waiting for the state of the connection to 127.0.0.1:10801 to be connected...Retry number: 1 [WARNING] DISTRIBUTED(1446239,ffff96adeec0,python):2025-07-15-13:57:11.936.911 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:485] Connect] Connection 21 source: 127.0.0.1:51496, destination: 127.0.0.1:10801 [WARNING] DISTRIBUTED(1446239,ffff2efdefa0,python):2025-07-15-13:57:11.936.921 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:79] ConnectedEventHandler] Connection from 127.0.0.1:51496 to 127.0.0.1:10801 is successfully created. System errno: Success [WARNING] DISTRIBUTED(1446239,ffff96adeec0,python):2025-07-15-13:57:11.937.019 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:494] Connect] Waiting for the state of the connection to 127.0.0.1:10801 to be connected...Retry number: 1 [WARNING] DISTRIBUTED(1446233,ffff5657efa0,python):2025-07-15-13:57:12.001.721 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:79] ConnectedEventHandler] Connection from 127.0.0.1:51504 to 127.0.0.1:10801 is successfully created. System errno: Success [WARNING] DISTRIBUTED(1446233,ffffbe06eec0,python):2025-07-15-13:57:12.001.728 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:485] Connect] Connection 21 source: 127.0.0.1:51504, destination: 127.0.0.1:10801 [WARNING] DISTRIBUTED(1446233,ffffbe06eec0,python):2025-07-15-13:57:12.002.009 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:485] Connect] Connection 22 source: 127.0.0.1:51506, destination: 127.0.0.1:10801 [WARNING] DISTRIBUTED(1446233,ffff5759efa0,python):2025-07-15-13:57:12.002.040 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:79] ConnectedEventHandler] Connection from 127.0.0.1:51506 to 127.0.0.1:10801 is successfully created. System errno: Success [WARNING] DISTRIBUTED(1446233,ffffbe06eec0,python):2025-07-15-13:57:12.002.056 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:494] Connect] Waiting for the state of the connection to 127.0.0.1:10801 to be connected...Retry number: 1 [WARNING] DISTRIBUTED(1446229,ffff8663eec0,python):2025-07-15-13:57:12.004.415 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:485] Connect] Connection 21 source: 127.0.0.1:51520, destination: 127.0.0.1:10801 [WARNING] DISTRIBUTED(1446229,ffff1eb4efa0,python):2025-07-15-13:57:12.004.415 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:79] ConnectedEventHandler] Connection from 127.0.0.1:51520 to 127.0.0.1:10801 is successfully created. System errno: Success [WARNING] DISTRIBUTED(1446229,ffff8663eec0,python):2025-07-15-13:57:12.004.493 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:494] Connect] Waiting for the state of the connection to 127.0.0.1:10801 to be connected...Retry number: 1 [WARNING] DISTRIBUTED(1446209,ffffbbfbeec0,python):2025-07-15-13:57:12.086.580 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:485] Connect] Connection 22 source: 127.0.0.1:51524, destination: 127.0.0.1:10801 [WARNING] DISTRIBUTED(1446209,ffff554defa0,python):2025-07-15-13:57:12.086.601 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:79] ConnectedEventHandler] Connection from 127.0.0.1:51524 to 127.0.0.1:10801 is successfully created. 
System errno: Success [WARNING] DISTRIBUTED(1446209,ffffbbfbeec0,python):2025-07-15-13:57:12.086.626 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:494] Connect] Waiting for the state of the connection to 127.0.0.1:10801 to be connected...Retry number: 2 [WARNING] DISTRIBUTED(1446221,ffff910eeec0,python):2025-07-15-13:57:12.172.520 [mindspore/ccsrc/distributed/cluster/cluster_context.cc:246] BuildCluster] Topology build timed out., retry(1/1200). [WARNING] DISTRIBUTED(1446213,ffff93b8eec0,python):2025-07-15-13:57:12.286.005 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:485] Connect] Connection 22 source: 127.0.0.1:51532, destination: 127.0.0.1:10801 [WARNING] DISTRIBUTED(1446213,ffff2d0aefa0,python):2025-07-15-13:57:12.286.028 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:79] ConnectedEventHandler] Connection from 127.0.0.1:51532 to 127.0.0.1:10801 is successfully created. System errno: Success [WARNING] DISTRIBUTED(1446213,ffff93b8eec0,python):2025-07-15-13:57:12.286.061 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:494] Connect] Waiting for the state of the connection to 127.0.0.1:10801 to be connected...Retry number: 2 [WARNING] DISTRIBUTED(1446225,ffff8ecbeec0,python):2025-07-15-13:57:12.332.170 [mindspore/ccsrc/distributed/cluster/cluster_context.cc:246] BuildCluster] Topology build timed out., retry(1/1200). [WARNING] DISTRIBUTED(1446217,ffffb3e7eec0,python):2025-07-15-13:57:12.336.420 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:485] Connect] Connection 22 source: 127.0.0.1:51546, destination: 127.0.0.1:10801 [WARNING] DISTRIBUTED(1446217,ffffb3e7eec0,python):2025-07-15-13:57:12.336.469 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:494] Connect] Waiting for the state of the connection to 127.0.0.1:10801 to be connected...Retry number: 2 [WARNING] DISTRIBUTED(1446217,ffff4d3aefa0,python):2025-07-15-13:57:12.336.507 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:79] ConnectedEventHandler] Connection from 127.0.0.1:51546 to 127.0.0.1:10801 is successfully created. System errno: Success [WARNING] DISTRIBUTED(1446239,ffff96adeec0,python):2025-07-15-13:57:12.437.228 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:485] Connect] Connection 22 source: 127.0.0.1:51560, destination: 127.0.0.1:10801 [WARNING] DISTRIBUTED(1446239,ffff2fffefa0,python):2025-07-15-13:57:12.437.263 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:79] ConnectedEventHandler] Connection from 127.0.0.1:51560 to 127.0.0.1:10801 is successfully created. System errno: Success [WARNING] DISTRIBUTED(1446239,ffff96adeec0,python):2025-07-15-13:57:12.437.272 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:494] Connect] Waiting for the state of the connection to 127.0.0.1:10801 to be connected...Retry number: 2 [WARNING] DISTRIBUTED(1446233,ffffbe06eec0,python):2025-07-15-13:57:12.502.735 [mindspore/ccsrc/distributed/cluster/cluster_context.cc:246] BuildCluster] Topology build timed out., retry(1/1200). 
[WARNING] DISTRIBUTED(1446229,ffff8663eec0,python):2025-07-15-13:57:12.504.711 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:485] Connect] Connection 22 source: 127.0.0.1:51570, destination: 127.0.0.1:10801 [WARNING] DISTRIBUTED(1446229,ffff8663eec0,python):2025-07-15-13:57:12.504.753 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:494] Connect] Waiting for the state of the connection to 127.0.0.1:10801 to be connected...Retry number: 2 [WARNING] DISTRIBUTED(1446229,ffff1fb6efa0,python):2025-07-15-13:57:12.504.750 [mindspore/ccsrc/distributed/rpc/tcp/tcp_comm.cc:79] ConnectedEventHandler] Connection from 127.0.0.1:51570 to 127.0.0.1:10801 is successfully created. System errno: Success [WARNING] DISTRIBUTED(1446209,ffffbbfbeec0,python):2025-07-15-13:57:12.587.120 [mindspore/ccsrc/distributed/cluster/cluster_context.cc:246] BuildCluster] Topology build timed out., retry(1/1200). [WARNING] DISTRIBUTED(1446221,ffff910eeec0,python):2025-07-15-13:57:12.672.647 [mindspore/ccsrc/distributed/cluster/cluster_context.cc:246] BuildCluster] Topology build timed out., retry(2/1200). [WARNING] DISTRIBUTED(1446213,ffff93b8eec0,python):2025-07-15-13:57:12.786.570 [mindspore/ccsrc/distributed/cluster/cluster_context.cc:246] BuildCluster] Topology build timed out., retry(1/1200). [WARNING] DISTRIBUTED(1446225,ffff8ecbeec0,python):2025-07-15-13:57:12.832.287 [mindspore/ccsrc/distributed/cluster/cluster_context.cc:246] BuildCluster] Topology build timed out., retry(2/1200). [WARNING] DISTRIBUTED(1446217,ffffb3e7eec0,python):2025-07-15-13:57:12.836.983 [mindspore/ccsrc/distributed/cluster/cluster_context.cc:246] BuildCluster] Topology build timed out., retry(1/1200). [WARNING] DISTRIBUTED(1446239,ffff96adeec0,python):2025-07-15-13:57:12.937.739 [mindspore/ccsrc/distributed/cluster/cluster_context.cc:246] BuildCluster] Topology build timed out., retry(1/1200). [WARNING] DISTRIBUTED(1446233,ffffbe06eec0,python):2025-07-15-13:57:13.002.849 [mindspore/ccsrc/distributed/cluster/cluster_context.cc:246] BuildCluster] Topology build timed out., retry(2/1200). [WARNING] DISTRIBUTED(1446229,ffff8663eec0,python):2025-07-15-13:57:13.005.318 [mindspore/ccsrc/distributed/cluster/cluster_context.cc:246] BuildCluster] Topology build timed out., retry(1/1200). [WARNING] DISTRIBUTED(1446209,ffffbbfbeec0,python):2025-07-15-13:57:13.087.236 [mindspore/ccsrc/distributed/cluster/cluster_context.cc:246] BuildCluster] Topology build timed out., retry(2/1200). [WARNING] DISTRIBUTED(1446221,ffff910eeec0,python):2025-07-15-13:57:13.172.753 [mindspore/ccsrc/distributed/cluster/cluster_context.cc:246] BuildCluster] Topology build timed out., retry(3/1200). [WARNING] DISTRIBUTED(1446213,ffff93b8eec0,python):2025-07-15-13:57:13.286.686 [mindspore/ccsrc/distributed/cluster/cluster_context.cc:246] BuildCluster] Topology build timed out., retry(2/1200). [WARNING] DISTRIBUTED(1446225,ffff8ecbeec0,python):2025-07-15-13:57:13.332.395 [mindspore/ccsrc/distributed/cluster/cluster_context.cc:246] BuildCluster] Topology build timed out., retry(3/1200). [WARNING] DISTRIBUTED(1446217,ffffb3e7eec0,python):2025-07-15-13:57:13.337.106 [mindspore/ccsrc/distributed/cluster/cluster_context.cc:246] BuildCluster] Topology build timed out., retry(2/1200). [WARNING] DISTRIBUTED(1446239,ffff96adeec0,python):2025-07-15-13:57:13.437.846 [mindspore/ccsrc/distributed/cluster/cluster_context.cc:246] BuildCluster] Topology build timed out., retry(2/1200). 
[WARNING] DISTRIBUTED(1446233,ffffbe06eec0,python):2025-07-15-13:57:13.502.964 [mindspore/ccsrc/distributed/cluster/cluster_context.cc:246] BuildCluster] Topology build timed out., retry(3/1200). [WARNING] DISTRIBUTED(1446229,ffff8663eec0,python):2025-07-15-13:57:13.505.457 [mindspore/ccsrc/distributed/cluster/cluster_context.cc:249] BuildCluster] Cluster is successfully initialized. [WARNING] DISTRIBUTED(1446229,ffff8663eec0,python):2025-07-15-13:57:13.505.502 [mindspore/ccsrc/distributed/cluster/cluster_context.cc:355] PostProcess] This node 5 rank id: 5 [WARNING] DISTRIBUTED(1446209,ffffbbfbeec0,python):2025-07-15-13:57:13.587.378 [mindspore/ccsrc/distributed/cluster/cluster_context.cc:249] BuildCluster] Cluster is successfully initialized. [WARNING] DISTRIBUTED(1446209,ffffbbfbeec0,python):2025-07-15-13:57:13.587.423 [mindspore/ccsrc/distributed/cluster/cluster_context.cc:355] PostProcess] This node 0 rank id: 0 [WARNING] DISTRIBUTED(1446221,ffff910eeec0,python):2025-07-15-13:57:13.672.880 [mindspore/ccsrc/distributed/cluster/cluster_context.cc:249] BuildCluster] Cluster is successfully initialized. [WARNING] DISTRIBUTED(1446221,ffff910eeec0,python):2025-07-15-13:57:13.672.923 [mindspore/ccsrc/distributed/cluster/cluster_context.cc:355] PostProcess] This node 3 rank id: 3 [WARNING] DISTRIBUTED(1446213,ffff93b8eec0,python):2025-07-15-13:57:13.786.821 [mindspore/ccsrc/distributed/cluster/cluster_context.cc:249] BuildCluster] Cluster is successfully initialized. [WARNING] DISTRIBUTED(1446213,ffff93b8eec0,python):2025-07-15-13:57:13.786.866 [mindspore/ccsrc/distributed/cluster/cluster_context.cc:355] PostProcess] This node 1 rank id: 1 [WARNING] DISTRIBUTED(1446225,ffff8ecbeec0,python):2025-07-15-13:57:13.832.521 [mindspore/ccsrc/distributed/cluster/cluster_context.cc:249] BuildCluster] Cluster is successfully initialized. [WARNING] DISTRIBUTED(1446225,ffff8ecbeec0,python):2025-07-15-13:57:13.832.570 [mindspore/ccsrc/distributed/cluster/cluster_context.cc:355] PostProcess] This node 4 rank id: 4 [WARNING] DISTRIBUTED(1446217,ffffb3e7eec0,python):2025-07-15-13:57:13.837.289 [mindspore/ccsrc/distributed/cluster/cluster_context.cc:249] BuildCluster] Cluster is successfully initialized. [WARNING] DISTRIBUTED(1446217,ffffb3e7eec0,python):2025-07-15-13:57:13.837.361 [mindspore/ccsrc/distributed/cluster/cluster_context.cc:355] PostProcess] This node 2 rank id: 2 [WARNING] DISTRIBUTED(1446239,ffff96adeec0,python):2025-07-15-13:57:13.938.013 [mindspore/ccsrc/distributed/cluster/cluster_context.cc:249] BuildCluster] Cluster is successfully initialized. [WARNING] DISTRIBUTED(1446239,ffff96adeec0,python):2025-07-15-13:57:13.938.072 [mindspore/ccsrc/distributed/cluster/cluster_context.cc:355] PostProcess] This node 7 rank id: 7 [WARNING] DISTRIBUTED(1446233,ffffbe06eec0,python):2025-07-15-13:57:14.003.198 [mindspore/ccsrc/distributed/cluster/cluster_context.cc:249] BuildCluster] Cluster is successfully initialized. 
[WARNING] DISTRIBUTED(1446233,ffffbe06eec0,python):2025-07-15-13:57:14.003.294 [mindspore/ccsrc/distributed/cluster/cluster_context.cc:355] PostProcess] This node 6 rank id: 6 [WARNING] DISTRIBUTED(1446229,ffff8663eec0,python):2025-07-15-13:57:15.256.769 [mindspore/ccsrc/distributed/collective/collective_manager.cc:341] CreateCommunicationGroup] Start to create communication group: hccl_world_group [const vector]{0, 1, 2, 3, 4, 5, 6, 7}, async: 1, submit_now: 1 [WARNING] DISTRIBUTED(1446229,ffff8663eec0,python):2025-07-15-13:57:15.257.090 [mindspore/ccsrc/distributed/collective/collective_manager.cc:393] CreateCommunicationGroup] This group's communicator is async created hccl_world_group [WARNING] DEVICE(1446229,fffed127efa0,python):2025-07-15-13:57:15.257.363 [mindspore/ccsrc/plugin/res_manager/ascend/collective/ascend_communication_group.cc:254] SetGlobalCommInfo] Start to SetGlobalCommInfo for hccl_world_group, master_ip:2130706433, master_port:10801, node_rank:2130706433, total_rank_size:8, local_rank_size8 [WARNING] HCCL_ADPT(1446229,fffed127efa0,python):2025-07-15-13:57:15.257.463 [mindspore/ccsrc/utils/dlopen_macro.h:165] DlsymAscend] Dynamically load symbol HcclSetGlobalCommInfo failed, result = /home/jenkins/anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/communication/../lib/plugin/ascend/libhccl_plugin.so: undefined symbol: HcclSetGlobalCommInfo [WARNING] HCCL_ADPT(1446229,fffed127efa0,python):2025-07-15-13:57:15.257.502 [mindspore/ccsrc/plugin/res_manager/ascend/hccl_adapter/hccl_adapter.cc:635] HcclSetGlobalCommInfo] Func HcclSetGlobalCommInfo is not supported in CANN package. [WARNING] DEVICE(1446229,fffed127efa0,python):2025-07-15-13:57:15.257.531 [mindspore/ccsrc/plugin/res_manager/ascend/collective/ascend_communication_group.cc:265] SetGlobalCommInfo] End to SetGlobalCommInfo for hccl_world_group [WARNING] DEVICE(1446229,fffed127efa0,python):2025-07-15-13:57:15.258.201 [mindspore/ccsrc/plugin/device/cpu/hal/hardware/ms_collective_comm_lib.cc:251] QueryUniqueID] Retry to lookup the unique id for group hccl_world_group from the meta server node...Retry time: 399/400, sleep 1 ============================= test session starts ============================== platform linux -- Python 3.9.21, pytest-6.2.5, py-1.11.0, pluggy-0.13.1 rootdir: /home/jenkins/mindspore/testcases/testcases/tests/st/auto_parallel plugins: forked-1.6.0, hydra-core-1.3.2, xdist-1.32.0, anyio-4.9.0 collected 4 items custom_op_parallel.py [WARNING] DISTRIBUTED(1446209,ffffbbfbeec0,python):2025-07-15-13:57:15.362.660 [mindspore/ccsrc/distributed/collective/collective_manager.cc:341] CreateCommunicationGroup] Start to create communication group: hccl_world_group [const vector]{0, 1, 2, 3, 4, 5, 6, 7}, async: 1, submit_now: 1 [WARNING] DISTRIBUTED(1446209,ffffbbfbeec0,python):2025-07-15-13:57:15.362.887 [mindspore/ccsrc/distributed/collective/collective_manager.cc:393] CreateCommunicationGroup] This group's communicator is async created hccl_world_group [WARNING] DEVICE(1446209,fffefd06efa0,python):2025-07-15-13:57:15.363.114 [mindspore/ccsrc/plugin/res_manager/ascend/collective/ascend_communication_group.cc:254] SetGlobalCommInfo] Start to SetGlobalCommInfo for hccl_world_group, master_ip:2130706433, master_port:10801, node_rank:2130706433, total_rank_size:8, local_rank_size8 [WARNING] HCCL_ADPT(1446209,fffefd06efa0,python):2025-07-15-13:57:15.363.219 [mindspore/ccsrc/utils/dlopen_macro.h:165] DlsymAscend] Dynamically load symbol HcclSetGlobalCommInfo failed, result = 
/home/jenkins/anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/communication/../lib/plugin/ascend/libhccl_plugin.so: undefined symbol: HcclSetGlobalCommInfo [WARNING] HCCL_ADPT(1446209,fffefd06efa0,python):2025-07-15-13:57:15.363.255 [mindspore/ccsrc/plugin/res_manager/ascend/hccl_adapter/hccl_adapter.cc:635] HcclSetGlobalCommInfo] Func HcclSetGlobalCommInfo is not supported in CANN package. [WARNING] DEVICE(1446209,fffefd06efa0,python):2025-07-15-13:57:15.363.284 [mindspore/ccsrc/plugin/res_manager/ascend/collective/ascend_communication_group.cc:265] SetGlobalCommInfo] End to SetGlobalCommInfo for hccl_world_group [WARNING] DISTRIBUTED(1446209,fffefd06efa0,python):2025-07-15-13:57:15.371.340 [mindspore/ccsrc/distributed/collective/collective_manager.cc:1021] CreateDeviceCommunicator] Begin initialize communication group on the device side: hccl_world_group [WARNING] DEVICE(1446209,fffeaefdefa0,python):2025-07-15-13:57:15.371.655 [mindspore/ccsrc/plugin/res_manager/ascend/collective/ascend_communication_group.cc:169] InitByRootInfoConfig] Start to initialize communicator by HcclCommInitRootInfoConfig for hccl_world_group, hcclBufferSize is 200 MB, hcclDeterministic is 0 ============================= test session starts ============================== platform linux -- Python 3.9.21, pytest-6.2.5, py-1.11.0, pluggy-0.13.1 rootdir: /home/jenkins/mindspore/testcases/testcases/tests/st/auto_parallel plugins: forked-1.6.0, hydra-core-1.3.2, xdist-1.32.0, anyio-4.9.0 collected 4 items custom_op_parallel.py [WARNING] DISTRIBUTED(1446221,ffff910eeec0,python):2025-07-15-13:57:15.452.502 [mindspore/ccsrc/distributed/collective/collective_manager.cc:341] CreateCommunicationGroup] Start to create communication group: hccl_world_group [const vector]{0, 1, 2, 3, 4, 5, 6, 7}, async: 1, submit_now: 1 [WARNING] DISTRIBUTED(1446221,ffff910eeec0,python):2025-07-15-13:57:15.452.809 [mindspore/ccsrc/distributed/collective/collective_manager.cc:393] CreateCommunicationGroup] This group's communicator is async created hccl_world_group [WARNING] DEVICE(1446221,fffed22eefa0,python):2025-07-15-13:57:15.453.098 [mindspore/ccsrc/plugin/res_manager/ascend/collective/ascend_communication_group.cc:254] SetGlobalCommInfo] Start to SetGlobalCommInfo for hccl_world_group, master_ip:2130706433, master_port:10801, node_rank:2130706433, total_rank_size:8, local_rank_size8 [WARNING] HCCL_ADPT(1446221,fffed22eefa0,python):2025-07-15-13:57:15.453.191 [mindspore/ccsrc/utils/dlopen_macro.h:165] DlsymAscend] Dynamically load symbol HcclSetGlobalCommInfo failed, result = /home/jenkins/anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/communication/../lib/plugin/ascend/libhccl_plugin.so: undefined symbol: HcclSetGlobalCommInfo [WARNING] HCCL_ADPT(1446221,fffed22eefa0,python):2025-07-15-13:57:15.453.229 [mindspore/ccsrc/plugin/res_manager/ascend/hccl_adapter/hccl_adapter.cc:635] HcclSetGlobalCommInfo] Func HcclSetGlobalCommInfo is not supported in CANN package. 
[WARNING] DEVICE(1446221,fffed22eefa0,python):2025-07-15-13:57:15.453.260 [mindspore/ccsrc/plugin/res_manager/ascend/collective/ascend_communication_group.cc:265] SetGlobalCommInfo] End to SetGlobalCommInfo for hccl_world_group [WARNING] DISTRIBUTED(1446221,fffed22eefa0,python):2025-07-15-13:57:15.453.665 [mindspore/ccsrc/distributed/collective/collective_manager.cc:1021] CreateDeviceCommunicator] Begin initialize communication group on the device side: hccl_world_group [WARNING] DEVICE(1446221,fffed1adefa0,python):2025-07-15-13:57:15.453.973 [mindspore/ccsrc/plugin/res_manager/ascend/collective/ascend_communication_group.cc:169] InitByRootInfoConfig] Start to initialize communicator by HcclCommInitRootInfoConfig for hccl_world_group, hcclBufferSize is 200 MB, hcclDeterministic is 0 ============================= test session starts ============================== platform linux -- Python 3.9.21, pytest-6.2.5, py-1.11.0, pluggy-0.13.1 rootdir: /home/jenkins/mindspore/testcases/testcases/tests/st/auto_parallel plugins: forked-1.6.0, hydra-core-1.3.2, xdist-1.32.0, anyio-4.9.0 collected 4 items custom_op_parallel.py [WARNING] DISTRIBUTED(1446225,ffff8ecbeec0,python):2025-07-15-13:57:15.637.478 [mindspore/ccsrc/distributed/collective/collective_manager.cc:341] CreateCommunicationGroup] Start to create communication group: hccl_world_group [const vector]{0, 1, 2, 3, 4, 5, 6, 7}, async: 1, submit_now: 1 [WARNING] DISTRIBUTED(1446225,ffff8ecbeec0,python):2025-07-15-13:57:15.637.769 [mindspore/ccsrc/distributed/collective/collective_manager.cc:393] CreateCommunicationGroup] This group's communicator is async created hccl_world_group [WARNING] DEVICE(1446225,fffe8bffefa0,python):2025-07-15-13:57:15.638.095 [mindspore/ccsrc/plugin/res_manager/ascend/collective/ascend_communication_group.cc:254] SetGlobalCommInfo] Start to SetGlobalCommInfo for hccl_world_group, master_ip:2130706433, master_port:10801, node_rank:2130706433, total_rank_size:8, local_rank_size8 [WARNING] HCCL_ADPT(1446225,fffe8bffefa0,python):2025-07-15-13:57:15.638.191 [mindspore/ccsrc/utils/dlopen_macro.h:165] DlsymAscend] Dynamically load symbol HcclSetGlobalCommInfo failed, result = /home/jenkins/anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/communication/../lib/plugin/ascend/libhccl_plugin.so: undefined symbol: HcclSetGlobalCommInfo [WARNING] HCCL_ADPT(1446225,fffe8bffefa0,python):2025-07-15-13:57:15.638.226 [mindspore/ccsrc/plugin/res_manager/ascend/hccl_adapter/hccl_adapter.cc:635] HcclSetGlobalCommInfo] Func HcclSetGlobalCommInfo is not supported in CANN package. 
[WARNING] DEVICE(1446225,fffe8bffefa0,python):2025-07-15-13:57:15.638.255 [mindspore/ccsrc/plugin/res_manager/ascend/collective/ascend_communication_group.cc:265] SetGlobalCommInfo] End to SetGlobalCommInfo for hccl_world_group [WARNING] DISTRIBUTED(1446225,fffe8bffefa0,python):2025-07-15-13:57:15.638.975 [mindspore/ccsrc/distributed/collective/collective_manager.cc:1021] CreateDeviceCommunicator] Begin initialize communication group on the device side: hccl_world_group [WARNING] DEVICE(1446225,fffe8b7eefa0,python):2025-07-15-13:57:15.639.322 [mindspore/ccsrc/plugin/res_manager/ascend/collective/ascend_communication_group.cc:169] InitByRootInfoConfig] Start to initialize communicator by HcclCommInitRootInfoConfig for hccl_world_group, hcclBufferSize is 200 MB, hcclDeterministic is 0 ============================= test session starts ============================== platform linux -- Python 3.9.21, pytest-6.2.5, py-1.11.0, pluggy-0.13.1 rootdir: /home/jenkins/mindspore/testcases/testcases/tests/st/auto_parallel plugins: forked-1.6.0, hydra-core-1.3.2, xdist-1.32.0, anyio-4.9.0 collected 4 items custom_op_parallel.py [WARNING] DISTRIBUTED(1446217,ffffb3e7eec0,python):2025-07-15-13:57:15.647.377 [mindspore/ccsrc/distributed/collective/collective_manager.cc:341] CreateCommunicationGroup] Start to create communication group: hccl_world_group [const vector]{0, 1, 2, 3, 4, 5, 6, 7}, async: 1, submit_now: 1 [WARNING] DISTRIBUTED(1446217,ffffb3e7eec0,python):2025-07-15-13:57:15.647.594 [mindspore/ccsrc/distributed/collective/collective_manager.cc:393] CreateCommunicationGroup] This group's communicator is async created hccl_world_group [WARNING] DEVICE(1446217,fffef506efa0,python):2025-07-15-13:57:15.647.855 [mindspore/ccsrc/plugin/res_manager/ascend/collective/ascend_communication_group.cc:254] SetGlobalCommInfo] Start to SetGlobalCommInfo for hccl_world_group, master_ip:2130706433, master_port:10801, node_rank:2130706433, total_rank_size:8, local_rank_size8 [WARNING] HCCL_ADPT(1446217,fffef506efa0,python):2025-07-15-13:57:15.647.955 [mindspore/ccsrc/utils/dlopen_macro.h:165] DlsymAscend] Dynamically load symbol HcclSetGlobalCommInfo failed, result = /home/jenkins/anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/communication/../lib/plugin/ascend/libhccl_plugin.so: undefined symbol: HcclSetGlobalCommInfo [WARNING] HCCL_ADPT(1446217,fffef506efa0,python):2025-07-15-13:57:15.647.990 [mindspore/ccsrc/plugin/res_manager/ascend/hccl_adapter/hccl_adapter.cc:635] HcclSetGlobalCommInfo] Func HcclSetGlobalCommInfo is not supported in CANN package. 
[WARNING] DEVICE(1446217,fffef506efa0,python):2025-07-15-13:57:15.648.019 [mindspore/ccsrc/plugin/res_manager/ascend/collective/ascend_communication_group.cc:265] SetGlobalCommInfo] End to SetGlobalCommInfo for hccl_world_group [WARNING] DISTRIBUTED(1446217,fffef506efa0,python):2025-07-15-13:57:15.648.593 [mindspore/ccsrc/distributed/collective/collective_manager.cc:1021] CreateDeviceCommunicator] Begin initialize communication group on the device side: hccl_world_group [WARNING] DEVICE(1446217,fffef485efa0,python):2025-07-15-13:57:15.648.900 [mindspore/ccsrc/plugin/res_manager/ascend/collective/ascend_communication_group.cc:169] InitByRootInfoConfig] Start to initialize communicator by HcclCommInitRootInfoConfig for hccl_world_group, hcclBufferSize is 200 MB, hcclDeterministic is 0 ============================= test session starts ============================== platform linux -- Python 3.9.21, pytest-6.2.5, py-1.11.0, pluggy-0.13.1 rootdir: /home/jenkins/mindspore/testcases/testcases/tests/st/auto_parallel plugins: forked-1.6.0, hydra-core-1.3.2, xdist-1.32.0, anyio-4.9.0 collected 4 items custom_op_parallel.py [WARNING] DISTRIBUTED(1446239,ffff96adeec0,python):2025-07-15-13:57:15.716.582 [mindspore/ccsrc/distributed/collective/collective_manager.cc:341] CreateCommunicationGroup] Start to create communication group: hccl_world_group [const vector]{0, 1, 2, 3, 4, 5, 6, 7}, async: 1, submit_now: 1 [WARNING] DISTRIBUTED(1446239,ffff96adeec0,python):2025-07-15-13:57:15.716.920 [mindspore/ccsrc/distributed/collective/collective_manager.cc:393] CreateCommunicationGroup] This group's communicator is async created hccl_world_group [WARNING] DEVICE(1446239,fffee127efa0,python):2025-07-15-13:57:15.717.240 [mindspore/ccsrc/plugin/res_manager/ascend/collective/ascend_communication_group.cc:254] SetGlobalCommInfo] Start to SetGlobalCommInfo for hccl_world_group, master_ip:2130706433, master_port:10801, node_rank:2130706433, total_rank_size:8, local_rank_size8 [WARNING] HCCL_ADPT(1446239,fffee127efa0,python):2025-07-15-13:57:15.717.340 [mindspore/ccsrc/utils/dlopen_macro.h:165] DlsymAscend] Dynamically load symbol HcclSetGlobalCommInfo failed, result = /home/jenkins/anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/communication/../lib/plugin/ascend/libhccl_plugin.so: undefined symbol: HcclSetGlobalCommInfo [WARNING] HCCL_ADPT(1446239,fffee127efa0,python):2025-07-15-13:57:15.717.376 [mindspore/ccsrc/plugin/res_manager/ascend/hccl_adapter/hccl_adapter.cc:635] HcclSetGlobalCommInfo] Func HcclSetGlobalCommInfo is not supported in CANN package. 
[WARNING] DEVICE(1446239,fffee127efa0,python):2025-07-15-13:57:15.717.407 [mindspore/ccsrc/plugin/res_manager/ascend/collective/ascend_communication_group.cc:265] SetGlobalCommInfo] End to SetGlobalCommInfo for hccl_world_group [WARNING] DISTRIBUTED(1446239,fffee127efa0,python):2025-07-15-13:57:15.718.021 [mindspore/ccsrc/distributed/collective/collective_manager.cc:1021] CreateDeviceCommunicator] Begin initialize communication group on the device side: hccl_world_group [WARNING] DEVICE(1446239,fffed658efa0,python):2025-07-15-13:57:15.718.472 [mindspore/ccsrc/plugin/res_manager/ascend/collective/ascend_communication_group.cc:169] InitByRootInfoConfig] Start to initialize communicator by HcclCommInitRootInfoConfig for hccl_world_group, hcclBufferSize is 200 MB, hcclDeterministic is 0 [WARNING] DISTRIBUTED(1446229,fffed127efa0,python):2025-07-15-13:57:15.758.885 [mindspore/ccsrc/distributed/collective/collective_manager.cc:1021] CreateDeviceCommunicator] Begin initialize communication group on the device side: hccl_world_group ============================= test session starts ============================== platform linux -- Python 3.9.21, pytest-6.2.5, py-1.11.0, pluggy-0.13.1 rootdir: /home/jenkins/mindspore/testcases/testcases/tests/st/auto_parallel plugins: forked-1.6.0, hydra-core-1.3.2, xdist-1.32.0, anyio-4.9.0 collected 4 items custom_op_parallel.py [WARNING] DEVICE(1446229,fffec5d7efa0,python):2025-07-15-13:57:15.759.418 [mindspore/ccsrc/plugin/res_manager/ascend/collective/ascend_communication_group.cc:169] InitByRootInfoConfig] Start to initialize communicator by HcclCommInitRootInfoConfig for hccl_world_group, hcclBufferSize is 200 MB, hcclDeterministic is 0 [WARNING] DISTRIBUTED(1446233,ffffbe06eec0,python):2025-07-15-13:57:15.791.304 [mindspore/ccsrc/distributed/collective/collective_manager.cc:341] CreateCommunicationGroup] Start to create communication group: hccl_world_group [const vector]{0, 1, 2, 3, 4, 5, 6, 7}, async: 1, submit_now: 1 [WARNING] DISTRIBUTED(1446233,ffffbe06eec0,python):2025-07-15-13:57:15.791.648 [mindspore/ccsrc/distributed/collective/collective_manager.cc:393] CreateCommunicationGroup] This group's communicator is async created hccl_world_group [WARNING] DEVICE(1446233,fffefeefefa0,python):2025-07-15-13:57:15.791.909 [mindspore/ccsrc/plugin/res_manager/ascend/collective/ascend_communication_group.cc:254] SetGlobalCommInfo] Start to SetGlobalCommInfo for hccl_world_group, master_ip:2130706433, master_port:10801, node_rank:2130706433, total_rank_size:8, local_rank_size8 [WARNING] HCCL_ADPT(1446233,fffefeefefa0,python):2025-07-15-13:57:15.792.006 [mindspore/ccsrc/utils/dlopen_macro.h:165] DlsymAscend] Dynamically load symbol HcclSetGlobalCommInfo failed, result = /home/jenkins/anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/communication/../lib/plugin/ascend/libhccl_plugin.so: undefined symbol: HcclSetGlobalCommInfo [WARNING] HCCL_ADPT(1446233,fffefeefefa0,python):2025-07-15-13:57:15.792.045 [mindspore/ccsrc/plugin/res_manager/ascend/hccl_adapter/hccl_adapter.cc:635] HcclSetGlobalCommInfo] Func HcclSetGlobalCommInfo is not supported in CANN package. 
[WARNING] DEVICE(1446233,fffefeefefa0,python):2025-07-15-13:57:15.792.077 [mindspore/ccsrc/plugin/res_manager/ascend/collective/ascend_communication_group.cc:265] SetGlobalCommInfo] End to SetGlobalCommInfo for hccl_world_group
[WARNING] DISTRIBUTED(1446233,fffefeefefa0,python):2025-07-15-13:57:15.792.495 [mindspore/ccsrc/distributed/collective/collective_manager.cc:1021] CreateDeviceCommunicator] Begin initialize communication group on the device side: hccl_world_group
[WARNING] DEVICE(1446233,fffefe6eefa0,python):2025-07-15-13:57:15.792.820 [mindspore/ccsrc/plugin/res_manager/ascend/collective/ascend_communication_group.cc:169] InitByRootInfoConfig] Start to initialize communicator by HcclCommInitRootInfoConfig for hccl_world_group, hcclBufferSize is 200 MB, hcclDeterministic is 0
============================= test session starts ==============================
platform linux -- Python 3.9.21, pytest-6.2.5, py-1.11.0, pluggy-0.13.1
rootdir: /home/jenkins/mindspore/testcases/testcases/tests/st/auto_parallel
plugins: forked-1.6.0, hydra-core-1.3.2, xdist-1.32.0, anyio-4.9.0
collected 4 items

custom_op_parallel.py 
============================= test session starts ==============================
platform linux -- Python 3.9.21, pytest-6.2.5, py-1.11.0, pluggy-0.13.1
rootdir: /home/jenkins/mindspore/testcases/testcases/tests/st/auto_parallel
plugins: forked-1.6.0, hydra-core-1.3.2, xdist-1.32.0, anyio-4.9.0
collected 0 items / 1 error

==================================== ERRORS ====================================
____________________ ERROR collecting custom_op_parallel.py ____________________
custom_op_parallel.py:25: in <module>
    init()
/home/jenkins/anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/communication/management.py:203: in init
    init_hccl()
E   RuntimeError: Call aclrtSetDevice failed, ret[507033]. Got device count[8] and device id[1], please check if device id is valid.
E
E   ----------------------------------------------------
E   - C++ Call Stack: (For framework developers)
E   ----------------------------------------------------
E   mindspore/ccsrc/plugin/res_manager/ascend/hal_manager/ascend_hal_manager.cc:67 InitDevice

=============================== warnings summary ===============================
../../../../../../.local/lib/python3.9/site-packages/numpy/core/getlimits.py:549
  /home/jenkins/.local/lib/python3.9/site-packages/numpy/core/getlimits.py:549: UserWarning: The value of the smallest subnormal for type is zero.
    setattr(self, word, getattr(machar, word).flat[0])
../../../../../../.local/lib/python3.9/site-packages/numpy/core/getlimits.py:89
  /home/jenkins/.local/lib/python3.9/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for type is zero.
    return self._float_to_str(self.smallest_subnormal)
../../../../../../.local/lib/python3.9/site-packages/numpy/core/getlimits.py:549
  /home/jenkins/.local/lib/python3.9/site-packages/numpy/core/getlimits.py:549: UserWarning: The value of the smallest subnormal for type is zero.
    setattr(self, word, getattr(machar, word).flat[0])
../../../../../../.local/lib/python3.9/site-packages/numpy/core/getlimits.py:89
  /home/jenkins/.local/lib/python3.9/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for type is zero.
return self._float_to_str(self.smallest_subnormal) ../../../../../../anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/batchnorm_fold2.py:57 /home/jenkins/anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/batchnorm_fold2.py:57: DeprecationWarning: te_fusion.fusion_manager.fusion_manager.register is deprecated,please replace it with tbe.common.register.register_op_compute @fusion_manager.register("batchnorm_fold2") ../../../../../../anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/batchnorm_fold2_grad.py:56 /home/jenkins/anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/batchnorm_fold2_grad.py:56: DeprecationWarning: te_fusion.fusion_manager.fusion_manager.register is deprecated,please replace it with tbe.common.register.register_op_compute @fusion_manager.register("batchnorm_fold2_grad") ../../../../../../anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/batchnorm_fold2_grad_reduce.py:48 /home/jenkins/anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/batchnorm_fold2_grad_reduce.py:48: DeprecationWarning: te_fusion.fusion_manager.fusion_manager.register is deprecated,please replace it with tbe.common.register.register_op_compute @fusion_manager.register("batchnorm_fold2_grad_reduce") ../../../../../../anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/correction_mul.py:51 /home/jenkins/anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/correction_mul.py:51: DeprecationWarning: te_fusion.fusion_manager.fusion_manager.register is deprecated,please replace it with tbe.common.register.register_op_compute @fusion_manager.register("correction_mul") ../../../../../../anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/correction_mul_grad.py:51 /home/jenkins/anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/correction_mul_grad.py:51: DeprecationWarning: te_fusion.fusion_manager.fusion_manager.register is deprecated,please replace it with tbe.common.register.register_op_compute @fusion_manager.register("correction_mul_grad") ../../../../../../anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/correction_mul_grad.py:143 /home/jenkins/anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/correction_mul_grad.py:143: DeprecationWarning: te_fusion.fusion_manager.fusion_manager.register is deprecated,please replace it with tbe.common.register.register_op_compute @fusion_manager.register("correction_mul_grad_reduce") ../../../../../../anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/fake_learned_scale_quant_perlayer.py:50 /home/jenkins/anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/fake_learned_scale_quant_perlayer.py:50: DeprecationWarning: te_fusion.fusion_manager.fusion_manager.register is deprecated,please replace it with tbe.common.register.register_op_compute @fusion_manager.register("fake_learned_scale_quant_perlayer") ../../../../../../anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/fake_learned_scale_quant_perlayer_grad.py:92 /home/jenkins/anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/fake_learned_scale_quant_perlayer_grad.py:92: DeprecationWarning: 
te_fusion.fusion_manager.fusion_manager.register is deprecated,please replace it with tbe.common.register.register_op_compute @fusion_manager.register("fake_learned_scale_quant_perlayer_grad_d") ../../../../../../anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/fake_learned_scale_quant_perlayer_grad_reduce.py:49 /home/jenkins/anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/fake_learned_scale_quant_perlayer_grad_reduce.py:49: DeprecationWarning: te_fusion.fusion_manager.fusion_manager.register is deprecated,please replace it with tbe.common.register.register_op_compute @fusion_manager.register("fake_learned_scale_quant_perlayer_grad_d_reduce") ../../../../../../anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/fake_learned_scale_quant_perchannel.py:50 /home/jenkins/anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/fake_learned_scale_quant_perchannel.py:50: DeprecationWarning: te_fusion.fusion_manager.fusion_manager.register is deprecated,please replace it with tbe.common.register.register_op_compute @fusion_manager.register("fake_learned_scale_quant_perchannel") ../../../../../../anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/fake_learned_scale_quant_perchannel_grad.py:91 /home/jenkins/anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/fake_learned_scale_quant_perchannel_grad.py:91: DeprecationWarning: te_fusion.fusion_manager.fusion_manager.register is deprecated,please replace it with tbe.common.register.register_op_compute @fusion_manager.register("fake_learned_scale_quant_perchannel_grad_d") ../../../../../../anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/fake_learned_scale_quant_perchannel_grad_reduce.py:48 /home/jenkins/anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/fake_learned_scale_quant_perchannel_grad_reduce.py:48: DeprecationWarning: te_fusion.fusion_manager.fusion_manager.register is deprecated,please replace it with tbe.common.register.register_op_compute @fusion_manager.register("fake_learned_scale_quant_perchannel_grad_d_reduce") ../../../../../../anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/fake_quant_perchannel.py:52 /home/jenkins/anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/fake_quant_perchannel.py:52: DeprecationWarning: te_fusion.fusion_manager.fusion_manager.register is deprecated,please replace it with tbe.common.register.register_op_compute @fusion_manager.register("fake_quant_perchannel") ../../../../../../anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/fake_quant_perchannel_grad.py:81 /home/jenkins/anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/fake_quant_perchannel_grad.py:81: DeprecationWarning: te_fusion.fusion_manager.fusion_manager.register is deprecated,please replace it with tbe.common.register.register_op_compute @fusion_manager.register("fake_quant_perchannel_grad") ../../../../../../anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/fake_quant_perlayer.py:54 /home/jenkins/anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/fake_quant_perlayer.py:54: DeprecationWarning: te_fusion.fusion_manager.fusion_manager.register is deprecated,please replace it with tbe.common.register.register_op_compute 
@fusion_manager.register("fake_quant_per_layer") ../../../../../../anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/fake_quant_perlayer_grad.py:81 /home/jenkins/anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/fake_quant_perlayer_grad.py:81: DeprecationWarning: te_fusion.fusion_manager.fusion_manager.register is deprecated,please replace it with tbe.common.register.register_op_compute @fusion_manager.register("fake_quant_per_layer_grad") ../../../../../../anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/minmax_update_perchannel.py:50 /home/jenkins/anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/minmax_update_perchannel.py:50: DeprecationWarning: te_fusion.fusion_manager.fusion_manager.register is deprecated,please replace it with tbe.common.register.register_op_compute @fusion_manager.register("minmax_update_perchannel") ../../../../../../anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/minmax_update_perlayer.py:50 /home/jenkins/anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/minmax_update_perlayer.py:50: DeprecationWarning: te_fusion.fusion_manager.fusion_manager.register is deprecated,please replace it with tbe.common.register.register_op_compute @fusion_manager.register("minmax_update_perlayer") -- Docs: https://docs.pytest.org/en/stable/warnings.html =========================== short test summary info ============================ ERROR custom_op_parallel.py - RuntimeError: Call aclrtSetDevice failed, ret[5... !!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!! ================== 22 warnings, 1 error in 167.63s (0:02:47) =================== [WARNING] DEVICE(1446213,ffff93b8eec0,python):2025-07-15-13:59:54.342.775 [mindspore/ccsrc/plugin/device/ascend/hal/hardware/ascend_device_res_manager.cc:350] SyncAllStreams] The ascend_res_manager_ is nullptr in scenarios where it is not actually executed [ERROR] ME(1446132:281472914222784,MainProcess):2025-07-15-13:59:55.880.662 [mindspore/parallel/cluster/process_entity/_api.py:363] Worker process 1446213 exit with exception. Error code: 2. [WARNING] ME(1446132:281472914222784,MainProcess):2025-07-15-13:59:55.880.968 [mindspore/parallel/cluster/process_entity/_api.py:369] There's worker exits with exception, kill all other workers. [ERROR] ME(1446132:281472914222784,MainProcess):2025-07-15-14:00:30.360.003 [mindspore/parallel/cluster/process_entity/_api.py:382] Scheduler process 1446207 exit with exception. 
[ERROR] ME(1446132:281472914222784,MainProcess):2025-07-15-14:00:30.361.049 [mindspore/parallel/cluster/process_entity/_api.py:603] Time out nodes are ['0', '2', '3', '4', '5', '6', '7']
./custom_op_parallel_log/worker_1.log-12-platform linux -- Python 3.9.21, pytest-6.2.5, py-1.11.0, pluggy-0.13.1
./custom_op_parallel_log/worker_1.log-13-rootdir: /home/jenkins/mindspore/testcases/testcases/tests/st/auto_parallel
./custom_op_parallel_log/worker_1.log-14-plugins: forked-1.6.0, hydra-core-1.3.2, xdist-1.32.0, anyio-4.9.0
./custom_op_parallel_log/worker_1.log-15-collected 0 items / 1 error
./custom_op_parallel_log/worker_1.log-16-
./custom_op_parallel_log/worker_1.log:17:==================================== ERRORS ====================================
./custom_op_parallel_log/worker_1.log:18:____________________ ERROR collecting custom_op_parallel.py ____________________
./custom_op_parallel_log/worker_1.log-19-custom_op_parallel.py:25: in <module>
./custom_op_parallel_log/worker_1.log-20-    init()
./custom_op_parallel_log/worker_1.log-21-/home/jenkins/anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/communication/management.py:203: in init
./custom_op_parallel_log/worker_1.log-22-    init_hccl()
./custom_op_parallel_log/worker_1.log:23:E   RuntimeError: Call aclrtSetDevice failed, ret[507033]. Got device count[8] and device id[1], please check if device id is valid.
./custom_op_parallel_log/worker_1.log-24-E
./custom_op_parallel_log/worker_1.log-25-E   ----------------------------------------------------
./custom_op_parallel_log/worker_1.log-26-E   - C++ Call Stack: (For framework developers)
./custom_op_parallel_log/worker_1.log-27-E   ----------------------------------------------------
./custom_op_parallel_log/worker_1.log-28-E   mindspore/ccsrc/plugin/res_manager/ascend/hal_manager/ascend_hal_manager.cc:67 InitDevice
--
./custom_op_parallel_log/worker_1.log-115-  /home/jenkins/anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/minmax_update_perlayer.py:50: DeprecationWarning: te_fusion.fusion_manager.fusion_manager.register is deprecated,please replace it with tbe.common.register.register_op_compute
./custom_op_parallel_log/worker_1.log-116-  @fusion_manager.register("minmax_update_perlayer")
./custom_op_parallel_log/worker_1.log-117-
./custom_op_parallel_log/worker_1.log-118--- Docs: https://docs.pytest.org/en/stable/warnings.html
./custom_op_parallel_log/worker_1.log-119-=========================== short test summary info ============================
./custom_op_parallel_log/worker_1.log:120:ERROR custom_op_parallel.py - RuntimeError: Call aclrtSetDevice failed, ret[5...
./custom_op_parallel_log/worker_1.log-121-!!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!
./custom_op_parallel_log/worker_1.log-122-================== 22 warnings, 1 error in 167.63s (0:02:47) ===================
./custom_op_parallel_log/worker_1.log-123-[WARNING] DEVICE(1446213,ffff93b8eec0,python):2025-07-15-13:59:54.342.775 [mindspore/ccsrc/plugin/device/ascend/hal/hardware/ascend_device_res_manager.cc:350] SyncAllStreams] The ascend_res_manager_ is nullptr in scenarios where it is not actually executed
--
./custom_op_parallel_log/scheduler.log-90-[WARNING] DISTRIBUTED(1446207,ffff84f1eec0,python):2025-07-15-14:00:13.487.276 [mindspore/ccsrc/distributed/cluster/cluster_context.cc:154] Finalize] This log means the cluster is successfully created. Retry to finalize the node and exit cluster...
./custom_op_parallel_log/scheduler.log-91-[WARNING] DISTRIBUTED(1446207,ffff84f1eec0,python):2025-07-15-14:00:18.487.378 [mindspore/ccsrc/distributed/cluster/topology/meta_server_node.cc:98] Finalize] The meta server node can not be finalized because there are still 7 alive nodes.
./custom_op_parallel_log/scheduler.log-92-[WARNING] DISTRIBUTED(1446207,ffff84f1eec0,python):2025-07-15-14:00:18.487.420 [mindspore/ccsrc/distributed/cluster/cluster_context.cc:154] Finalize] This log means the cluster is successfully created. Retry to finalize the node and exit cluster...
./custom_op_parallel_log/scheduler.log-93-[WARNING] DISTRIBUTED(1446207,ffff84f1eec0,python):2025-07-15-14:00:23.487.520 [mindspore/ccsrc/distributed/cluster/topology/meta_server_node.cc:98] Finalize] The meta server node can not be finalized because there are still 7 alive nodes.
./custom_op_parallel_log/scheduler.log-94-[WARNING] DISTRIBUTED(1446207,ffff84f1eec0,python):2025-07-15-14:00:23.487.561 [mindspore/ccsrc/distributed/cluster/cluster_context.cc:154] Finalize] This log means the cluster is successfully created. Retry to finalize the node and exit cluster...
./custom_op_parallel_log/scheduler.log:95:[ERROR] DISTRIBUTED(1446207,ffff1d42efa0,python):2025-07-15-14:00:26.005.551 [mindspore/ccsrc/distributed/cluster/topology/meta_server_node.cc:511] UpdateTopoState] The node: 0 is timed out. It may exit with exception, please check this node's log.
./custom_op_parallel_log/scheduler.log:96:[ERROR] DISTRIBUTED(1446207,ffff1d42efa0,python):2025-07-15-14:00:26.005.592 [mindspore/ccsrc/distributed/cluster/topology/meta_server_node.cc:511] UpdateTopoState] The node: 2 is timed out. It may exit with exception, please check this node's log.
./custom_op_parallel_log/scheduler.log:97:[ERROR] DISTRIBUTED(1446207,ffff1d42efa0,python):2025-07-15-14:00:26.005.619 [mindspore/ccsrc/distributed/cluster/topology/meta_server_node.cc:511] UpdateTopoState] The node: 3 is timed out. It may exit with exception, please check this node's log.
./custom_op_parallel_log/scheduler.log:98:[ERROR] DISTRIBUTED(1446207,ffff1d42efa0,python):2025-07-15-14:00:26.005.644 [mindspore/ccsrc/distributed/cluster/topology/meta_server_node.cc:511] UpdateTopoState] The node: 4 is timed out. It may exit with exception, please check this node's log.
./custom_op_parallel_log/scheduler.log:99:[ERROR] DISTRIBUTED(1446207,ffff1d42efa0,python):2025-07-15-14:00:26.005.685 [mindspore/ccsrc/distributed/cluster/topology/meta_server_node.cc:511] UpdateTopoState] The node: 5 is timed out. It may exit with exception, please check this node's log.
./custom_op_parallel_log/scheduler.log:100:[ERROR] DISTRIBUTED(1446207,ffff1d42efa0,python):2025-07-15-14:00:26.005.710 [mindspore/ccsrc/distributed/cluster/topology/meta_server_node.cc:511] UpdateTopoState] The node: 6 is timed out. It may exit with exception, please check this node's log.
./custom_op_parallel_log/scheduler.log:101:[ERROR] DISTRIBUTED(1446207,ffff1d42efa0,python):2025-07-15-14:00:26.005.734 [mindspore/ccsrc/distributed/cluster/topology/meta_server_node.cc:511] UpdateTopoState] The node: 7 is timed out. It may exit with exception, please check this node's log.
./custom_op_parallel_log/scheduler.log:102:[ERROR] DISTRIBUTED(1446207,ffff84f1eec0,python):2025-07-15-14:00:28.487.671 [mindspore/ccsrc/distributed/cluster/topology/meta_server_node.cc:103] Finalize] There are 7 abnormal compute graph nodes.
./custom_op_parallel_log/scheduler.log-103-============================= test session starts ==============================
./custom_op_parallel_log/scheduler.log-104-platform linux -- Python 3.9.21, pytest-6.2.5, py-1.11.0, pluggy-0.13.1
./custom_op_parallel_log/scheduler.log-105-rootdir: /home/jenkins/mindspore/testcases/testcases/tests/st/auto_parallel
./custom_op_parallel_log/scheduler.log-106-plugins: forked-1.6.0, hydra-core-1.3.2, xdist-1.32.0, anyio-4.9.0
./custom_op_parallel_log/scheduler.log-107-collected 0 items / 1 error
./custom_op_parallel_log/scheduler.log-108-
./custom_op_parallel_log/scheduler.log:109:==================================== ERRORS ====================================
./custom_op_parallel_log/scheduler.log:110:____________________ ERROR collecting custom_op_parallel.py ____________________
./custom_op_parallel_log/scheduler.log-111-custom_op_parallel.py:25: in <module>
./custom_op_parallel_log/scheduler.log-112-    init()
./custom_op_parallel_log/scheduler.log-113-/home/jenkins/anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/communication/management.py:213: in init
./custom_op_parallel_log/scheduler.log-114-    init_cluster()
./custom_op_parallel_log/scheduler.log:115:E   RuntimeError: The total number of timed out node is 7. Timed out node list is: [const vector]{0, 2, 3, 4, 5, 6, 7}, worker 0 is the first one timed out, please check its log.
./custom_op_parallel_log/scheduler.log-116-E
./custom_op_parallel_log/scheduler.log-117-E   ----------------------------------------------------
./custom_op_parallel_log/scheduler.log-118-E   - C++ Call Stack: (For framework developers)
./custom_op_parallel_log/scheduler.log-119-E   ----------------------------------------------------
./custom_op_parallel_log/scheduler.log-120-E   mindspore/ccsrc/distributed/cluster/topology/meta_server_node.cc:517 UpdateTopoState
--
./custom_op_parallel_log/scheduler.log-207-  /home/jenkins/anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/minmax_update_perlayer.py:50: DeprecationWarning: te_fusion.fusion_manager.fusion_manager.register is deprecated,please replace it with tbe.common.register.register_op_compute
./custom_op_parallel_log/scheduler.log-208-  @fusion_manager.register("minmax_update_perlayer")
./custom_op_parallel_log/scheduler.log-209-
./custom_op_parallel_log/scheduler.log-210--- Docs: https://docs.pytest.org/en/stable/warnings.html
./custom_op_parallel_log/scheduler.log-211-=========================== short test summary info ============================
./custom_op_parallel_log/scheduler.log:212:ERROR custom_op_parallel.py - RuntimeError: The total number of timed out nod...
./custom_op_parallel_log/scheduler.log-213-!!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!
./custom_op_parallel_log/scheduler.log-214-================== 22 warnings, 1 error in 202.64s (0:03:22) ===================
Traceback (most recent call last):
  File "/home/jenkins/anaconda3/envs/ci39/bin/msrun", line 8, in <module>
    sys.exit(main())
  File "/home/jenkins/anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/parallel/cluster/run.py", line 191, in main
    run(args)
  File "/home/jenkins/anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/parallel/cluster/run.py", line 185, in run
    process_manager.run()
  File "/home/jenkins/anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/parallel/cluster/process_entity/_api.py", line 268, in run
    self.join_processes()
  File "/home/jenkins/anaconda3/envs/ci39/lib/python3.9/site-packages/mindspore/parallel/cluster/process_entity/_api.py", line 387, in join_processes
    raise RuntimeError("Distributed job exited with exception. Please check logs in "
RuntimeError: Distributed job exited with exception. Please check logs in directory: ./custom_op_parallel_log.
F
=================================== FAILURES ===================================
________________________ test_msrun_custom_op_no_tuple _________________________

    @arg_mark(plat_marks=["platform_ascend910b"], level_mark="level1", card_mark="allcards", essential_mark="essential")
    def test_msrun_custom_op_no_tuple():
        '''
        Feature: test custom op parallel
        Description: Test a net that consists of 10 sharded matmul ops using msrun.
        Expectation: Run success; results before and after enabling this feature should be the same.
        '''
        return_code = os.system(
            "msrun --worker_num=8 --local_worker_num=8 --master_addr=127.0.0.1 "
            "--master_port=10801 --join=True --log_dir=./custom_op_parallel_log "
            "pytest -s custom_op_parallel.py"
        )
>       assert return_code == 0
E       assert 256 == 0

test_custom_op_parallel.py:31: AssertionError
=========================== short test summary info ============================
FAILED test_custom_op_parallel.py::test_msrun_custom_op_no_tuple - assert 256...
======================== 1 failed in 211.15s (0:03:31) =========================
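
Note on the collection error above: the traceback shows that custom_op_parallel.py calls mindspore.communication.init() at module level (line 25), so every msrun worker must successfully claim its Ascend device while pytest is still collecting the file; when aclrtSetDevice fails for the device derived from the exported RANK_ID (ret[507033] on device id 1 here), collection aborts before any test case runs, and the scheduler then times out the remaining ranks. The snippet below is a minimal, hypothetical sketch of such an entry point, for illustration only: apart from the init() call shown in the traceback, the imports, context settings, and their placement are assumptions, not the actual file contents.

    # Hypothetical sketch of the top of custom_op_parallel.py (not the real file).
    # Grounded only in the traceback above: line 25 calls mindspore.communication.init().
    import mindspore as ms
    from mindspore.communication import init

    ms.set_context(mode=ms.GRAPH_MODE, device_target="Ascend")  # assumption

    # Runs at import time: pytest collection of this module already triggers
    # aclrtSetDevice for the device chosen from RANK_ID/DEVICE_ID, so a device
    # failure aborts collection before any test executes.
    init()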