Skip to content

make normalize_key cpdef & add warnings for unpickle unmarshal fail#66

Merged
MOON-CLJ merged 3 commits intodouban:masterfrom
MOON-CLJ:cython_minor_fix
Feb 27, 2018
Merged

make normalize_key cpdef & add warnings for unpickle unmarshal fail#66
MOON-CLJ merged 3 commits intodouban:masterfrom
MOON-CLJ:cython_minor_fix

Conversation

@MOON-CLJ
Copy link
Contributor

@MOON-CLJ MOON-CLJ commented Jan 16, 2018

warnings.warn for unpickle & unmarshal fail, and only first time for each line of the caller code。

>>> import warnings
>>> warnings.warn("once")
__main__:1: UserWarning: once
>>> warnings.warn("once")
>>> warnings.warn("another")
__main__:1: UserWarning: another
>>> warnings.warn("another")
(venv) chenlijun@redis00 ~/libmc $ python tmp.py
db:foo
tmp.py:7: UserWarning: [libmc] unpickle failed. err:ValueError('insecure string pickle',)
  mc.get('foo')
(venv) chenlijun@redis00 ~/libmc $ cat tmp.py
import libmc
import pickle

mc = libmc.Client(['nain2:11221'], prefix=u'db:')
print mc.normalize_key('foo')
mc.set_raw('foo', pickle.dumps("aaa")[:3], 3600, 1 << 0)
mc.get('foo')

@MOON-CLJ MOON-CLJ changed the title make normalize_key cpdef & add warnings for unpickle unmarshal fail make normalize_key cpdef & add warnings for unpickle unmarshal fail[WIP] Jan 16, 2018
@MOON-CLJ MOON-CLJ force-pushed the cython_minor_fix branch 2 times, most recently from 4abf267 to aeeb318 Compare January 16, 2018 07:07
@MOON-CLJ MOON-CLJ changed the title make normalize_key cpdef & add warnings for unpickle unmarshal fail[WIP] make normalize_key cpdef & add warnings for unpickle unmarshal fail Jan 17, 2018
@MOON-CLJ MOON-CLJ force-pushed the cython_minor_fix branch 2 times, most recently from 0a82963 to c1a0626 Compare January 17, 2018 09:07
try:
dec_val = marshal.loads(dec_val)
except:
warnings.warn("[libmc] unmarshal failed %r" % dec_val)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dec_val 是 二进制应该没什么用吧,还比较大

Copy link
Contributor

@youngsofun youngsofun Jan 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

不如 把key 打出来?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay,考虑把key和error都打出来。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@youngsofun 应该不用key,get 知道自己是什么key,get_multi至少知道范围。

flags[0] = _FLAG_PICKLE
except:
pass
warnings.warn("[libmc] marshal & pickle both failed %r" % val)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

担心 val 太大

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

可以先打个 type?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

还有 pickle 失败的信息里可能也有有用的信息

except:
pass
except Exception as e:
warnings.warn("[libmc] marshal & pickle both failed. type:%r err:%r" % (type_, e))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may lead to a log flood if the unpackable object is encoded again and again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mckelvin #66 (comment) as i explain,it only output at the first time for each msg in warnings.warn(msg)。https://gist.github.com/MOON-CLJ/0f06a1e57d872c6f5f0f9e368326d406 this gist try to make it more clear。

so i think it's enough to avoid log flood in most case。

alias = PyUnicode_AsUTF8String(alias) if alias else None
servers_.append((host, port, alias))

Py_INCREF(servers_)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why remove this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mckelvin 5968657 i try to explain it in (new) commit msg。

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As is described in the doc:

The PyString_AsString return a NUL-terminated representation of the contents of string. The pointer refers to the internal buffer of string, not a copy.

So I think c_hosts[i] is using the host string object, which is inside the servers_.

Copy link
Contributor Author

@MOON-CLJ MOON-CLJ Jan 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mckelvin okay, i'll revert this change.(still not very sure as follow comment)

Copy link
Contributor Author

@MOON-CLJ MOON-CLJ Jan 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mckelvin but i am still not very clear about this line https://github.com/douban/libmc/pull/66/files#diff-721f906017b85cb2e3838d0775a3138dR376

host, port, alias = servers_[i]

host, port, alias is copy or ref?

if it's copy(new variable),Py_INCREF(severs_) is not necessary。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

confirm by @windreamer , it's ref。i'll revert this change。

In [14]: a = 1

In [15]: servers = [[a, 1, 2]]

In [16]: _a, _, _ = servers[0]

In [17]: id(_a)
Out[17]: 24355160

In [18]: id(a)
Out[18]: 24355160

Copy link
Contributor Author

@MOON-CLJ MOON-CLJ Jan 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mckelvin

to say the least, servers_ is a func local variable, it's lifetime will expand to the whole func。
so i doubt Py_INCREF(servers_) is just a programming style?

dec_val = marshal.loads(dec_val)
except:
except Exception as e:
warnings.warn("[libmc] unmarshal failed. err:%r" % e)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same log flood issue here.

dec_val = pickle.loads(dec_val)
except:
except Exception as e:
warnings.warn("[libmc] unpickle failed. err:%r" % e)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same log flood issue here.

else:
self._imp.disableConsistentFailover()

PyMem_Free(c_hosts)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These 3 lines are moved to earlier lines. Why?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mckelvin see 5968657 also

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mckelvin
Copy link
Contributor

I'd appreciate it if you could polish the PR and git commit message: https://chris.beams.io/posts/git-commit/

@MOON-CLJ
Copy link
Contributor Author

I'd appreciate it if you could polish the PR and git commit message: https://chris.beams.io/posts/git-commit/

thx very much , i will try。

@MOON-CLJ MOON-CLJ force-pushed the cython_minor_fix branch 4 times, most recently from 1010f18 to c9cb4d7 Compare January 30, 2018 08:47
this ci include some minor fixes:

 - c_server_addr to server_addr, because it's basestring not c char
   array.

 - c_hosts c_ports c_aliases PyMem_Free after _imp.init immediately,
   _imp.init dont take over their buffer,_imp.init use snprintf to
   make copy,so i think free as soon possible do the right thing。
@youngsofun
Copy link
Contributor

youngsofun commented Feb 27, 2018

@MOON-CLJ 从你的 demo看,"首次" 是完全基于log msg 文本的吗?
但 如果 pickle 的错误包含 object 内容,或者 如果自定义 getstate 就风险:

  1. 就可能每次都不同?
  2. 除了 log 多,waring模块实现上应该还需要 存已经 log 过的(也可能是 hash)?
  3. value print 出来 可能会很大

https://github.com/python/cpython/blob/3.6/Lib/pickle.py#L265

所以,可能需要对 err 做裁剪? 后者要引用代码证明不存在上述问题)。

另外没有 key, 定位可能比较麻烦
当然 只有 key 没有具体错误信息,也可能不好定位错误原因
调用栈的信息,可能打出来是有用的?


cython 我写的少,细节如果不放心再找田老师确认一下?

@MOON-CLJ
Copy link
Contributor Author

@youngsofun

从你的 demo看,"首次" 是完全基于log msg 文本的吗?

基本上是,即warnings.warn所在的行数加上warnings.warn的内容去重。https://gist.github.com/MOON-CLJ/0f06a1e57d872c6f5f0f9e368326d406

调用栈的信息,可能打出来是有用的?

https://gist.github.com/MOON-CLJ/0f06a1e57d872c6f5f0f9e368326d406/revisions#diff-1ad944627ac1ed425afb954cf0f6f757L46 从这里判断,对于libmc的情况,出错时,会将调用libmc get方法那一行打印出来,定位也就够了?用法定位到了,也就不用key了?

所以问题应该集中在pickle或者unpickle error的出错信息上?我再确认下。

@MOON-CLJ
Copy link
Contributor Author

对于libmc的情况,出错时,会将调用libmc get方法那一行打印出来,定位也就够了?用法定位到了,也就不用key了?

不过对于cache deco这种集中调用get的地方不适用。

@MOON-CLJ
Copy link
Contributor Author

如果自定义 getstate 就风险

验证了下,确实会有风险。

cat /tmp/test_pickle.py
import pickle


class C(object):
    def __getstate__(self):
        #raise Exception("aaaa")
        return True

    def __setstate__(self, state):
        raise Exception("aaaa")
        pass


c = C()
print c
cc = pickle.dumps(c)
print [cc]
c = pickle.loads(cc)
print c

@MOON-CLJ
Copy link
Contributor Author

@youngsofun https://gist.github.com/MOON-CLJ/0f06a1e57d872c6f5f0f9e368326d406 gist updated

how about just print value type and stack?

@youngsofun
Copy link
Contributor

@MOON-CLJ 感觉, libmc 内部 调用 decode 位置固定, key 应该可以在 上一两层的 python frame 对象中找到? 和 stack 一样 单独 log 出来即可。先要确定 不是外部的调用

@MOON-CLJ
Copy link
Contributor Author

@youngsofun https://gist.github.com/MOON-CLJ/0f06a1e57d872c6f5f0f9e368326d406#file-python-warnings-warn-test-L75 将拿key的逻辑写在这个function里感觉也有点奇怪。

@youngsofun
Copy link
Contributor

@MOON-CLJ 可以先这样上,遇到具体问题再说

@youngsofun
Copy link
Contributor

直接 改 warnings,似乎不妥? 其他人也用呢?

@MOON-CLJ
Copy link
Contributor Author

MOON-CLJ commented Feb 27, 2018

@youngsofun 嗯 也是哈。我再想想。

warning about pickle and marshal err when encode_value or decode_value
@MOON-CLJ
Copy link
Contributor Author

@youngsofun updated

@MOON-CLJ MOON-CLJ merged commit 8317ccd into douban:master Feb 27, 2018
tclh123 added a commit that referenced this pull request Feb 18, 2019
ChangeLog:

- use pickle protocol version 2 (#52)
- fix rvalue reference (#53)
- Fix tests (#55)
- golibmc_test: Fix fragile test case (#56)
- Use connection pool in golibmc(again) (#57)
- golibmc Quit ret err (#59)
- cmake: Incorporate gtest in a standard way (#58)
- Rename refactor (#60)
- sync rapidjson/itoa.h upstream fix (#65)
- split cppcheck (#67)
- split FSM_GET_BYTES_CAS (#68)
- Avoid noisy recv_err when broadcast quit (#73)
- Try to support updating some servers without affecting others (#74)
- make normalize_key cpdef & add warnings for unpickle unmarshal fail (#66)
- fix fcntl usage when set socket nonblock (#69)
- minor fix (#61)
- add condition log macro (#77)
- Define and use notWaitForRetryTimeout (#82)
- upgrade travis to gcc 7 (#91)
- add func errCodeToString (#79)
- Introduce a reconnect mechanism in waitPoll (#88)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants