解决TensorFlow程序无限制占用GPU的方法
作者:逃离那片海岸 发布时间:2021-11-22 13:13:44
标签:TensorFlow,占用,GPU
今天遇到一个奇怪的现象,使用tensorflow-gpu的时候,出现内存超额~~如果我训练什么大型数据也就算了,关键我就写了一个y=W*x…显示如下图所示:
程序如下:
import tensorflow as tf
w = tf.Variable([[1.0,2.0]])
b = tf.Variable([[2.],[3.]])
y = tf.multiply(w,b)
init_op = tf.global_variables_initializer()
with tf.Session() as sess:
sess.run(init_op)
print(sess.run(y))
出错提示:
占用的内存越来越多,程序崩溃之后,整个电脑都奔溃了,因为整个显卡全被吃了
2018-06-10 18:28:00.263424: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2018-06-10 18:28:00.598075: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1356] Found device 0 with properties:
name: GeForce GTX 1060 major: 6 minor: 1 memoryClockRate(GHz): 1.6705
pciBusID: 0000:01:00.0
totalMemory: 6.00GiB freeMemory: 4.97GiB
2018-06-10 18:28:00.598453: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1435] Adding visible gpu devices: 0
2018-06-10 18:28:01.265600: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-06-10 18:28:01.265826: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:929] 0
2018-06-10 18:28:01.265971: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:942] 0: N
2018-06-10 18:28:01.266220: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4740 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060, pci bus id: 0000:01:00.0, compute capability: 6.1)
2018-06-10 18:28:01.331056: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 4.63G (4970853120 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.399111: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 4.17G (4473767936 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.468293: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 3.75G (4026391040 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.533138: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 3.37G (3623751936 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.602452: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 3.04G (3261376768 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.670225: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 2.73G (2935238912 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.733120: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 2.46G (2641714944 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.800101: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 2.21G (2377543424 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.862064: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.99G (2139789056 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.925434: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.79G (1925810176 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.986180: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.61G (1733229056 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:02.043456: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.45G (1559906048 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:02.103531: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.31G (1403915520 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:02.168973: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.18G (1263524096 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:02.229387: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.06G (1137171712 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:02.292997: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 976.04M (1023454720 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:02.356714: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 878.44M (921109248 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:02.418167: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 790.59M (828998400 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:02.482394: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 711.54M (746098688 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
分析原因:
显卡驱动不是最新版本,用__驱动软件__更新一下驱动,或者自己去下载更新。
TF运行太多,注销全部程序冲洗打开。
由于TF内核编写的原因,默认占用全部的GPU去训练自己的东西,也就是像meiguo一样优先政策吧
这个时候我们得设置两个方面:
选择什么样的占用方式?优先占用__还是__按需占用
选择最大占用多少GPU,因为占用过大GPU会导致其它程序奔溃。最好在0.7以下
先更新驱动:
再设置TF程序:
注意:单独设置一个不行!按照网上大神博客试了,结果效果还是很差(占用很多GPU)
设置TF:
按需占用
最大占用70%GPU
修改代码如下:
import tensorflow as tf
w = tf.Variable([[1.0,2.0]])
b = tf.Variable([[2.],[3.]])
y = tf.multiply(w,b)
init_op = tf.global_variables_initializer()
config = tf.ConfigProto(allow_soft_placement=True)
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.7)
config.gpu_options.allow_growth = True
with tf.Session(config=config) as sess:
sess.run(init_op)
print(sess.run(y))
成功解决:
2018-06-10 18:21:17.532630: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2018-06-10 18:21:17.852442: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1356] Found device 0 with properties:
name: GeForce GTX 1060 major: 6 minor: 1 memoryClockRate(GHz): 1.6705
pciBusID: 0000:01:00.0
totalMemory: 6.00GiB freeMemory: 4.97GiB
2018-06-10 18:21:17.852817: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1435] Adding visible gpu devices: 0
2018-06-10 18:21:18.511176: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-06-10 18:21:18.511397: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:929] 0
2018-06-10 18:21:18.511544: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:942] 0: N
2018-06-10 18:21:18.511815: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4740 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060, pci bus id: 0000:01:00.0, compute capability: 6.1)
[[2. 4.]
[3. 6.]]
参考资料:
主要参考博客
错误实例
来源:https://blog.csdn.net/u011046017/article/details/80644094
0
投稿
猜你喜欢
- 最近的项目中大量涉及数据的预处理工作,对于ndarray的使用非常频繁。其中ndarray如何进行数值筛选,总结了几种方法。1.按某些固定值
- 背景一直对语音合成系统比较感兴趣,总想能给自己合成一点内容,比如说合成小说,把我下载的电子书播报给我听等等。语音合成系统其实就是一个基于语音
- 本文实例讲述了php+mysql开发的最简单在线题库。分享给大家供大家参考,具体如下:题库,对于教育机构,学校,在线教育,是很有必要的,网上
- py文件不是html文件,当然不能在浏览器里打开。py文件可以用任何编辑器打开,py文件是和txt一样都是普通的文本文件,只是python解
- 1.嵌入 IFrame(/assets/img/anchor.svg)]()](https://gradio.app/sharing-you
- 感谢AKA及作者。Perl 中的正则表达式正则表达式的三种形式正则表达式中的常用模式正则表达式的 8 大原则 &nbs
- Timer: 隔一定时间调用一个函数,如果想实现每隔一段时间就调用一个函数的话,就要在Timer调用的函数中,再次设置Timer。Timer
- 代码如下:<% function GetBot() '查询蜘蛛 dim s_
- ASP链接MSSQL2005的链接字符串如下:Provider=SQLNCLI;Server=.\SQLEXPRESS;Database=m
- 这篇文章主要介绍了Oracle数据库到SQL Server数据库主键的迁移过程,具体内容请参考下文。由于项目需要要将以前Oracle的数据库
- 在运维场景下,我们经常需要在服务器上用正则表达式来匹配IP地址。shell和其它编程语言一样,也可以使用正则分组捕获,不过不能使用 $1或\
- 本文实例讲述了Python实现读取TXT文件数据并存进内置数据库SQLite3的方法。分享给大家供大家参考,具体如下:当TXT文件太大,计算
- 本文实例讲述了Python3实现的爬虫爬取数据并存入mysql数据库操作。分享给大家供大家参考,具体如下:爬一个电脑客户端的订单。罗总推荐,
- 从Python字符串中删除最后一个分号或者逗号第一种方法使用 str.rstrip() 方法从字符串中删除最后一个逗号,例如 new_str
- 首先,运行 Python 解释器,导入 re 模块并编译一个 RE:#!python Python 2.2.2 (#1, Feb 10 20
- np.random模块常用的一些方法介绍名称作用numpy.random.rand(d0, d1, …, dn)生成一
- Python encode()方法encode() 方法为字符串类型(str)提供的方法,用于将 str 类型转换成 bytes 类型,这个
- 本文为 djangorestframework-simplejwt 使用记录。(官方文档) 1. 安装 pip inst
- 不知道大家有没有这样一个烦恼,“自己的电脑总是被别人使用,又不好意思设置密码”,所以利用python设计了一个程序来实现自由管控。功能虽然简
- 本文实例讲述了Python使用matplotlib实现交换式图形显示功能。分享给大家供大家参考,具体如下:一 代码from random i