# cacache [![npm version](https://img.shields.io/npm/v/cacache.svg)](https://npm.im/cacache) [![license](https://img.shields.io/npm/l/cacache.svg)](https://npm.im/cacache) [![Travis](https://img.shields.io/travis/zkat/cacache.svg)](https://travis-ci.org/zkat/cacache) [![AppVeyor](https://ci.appveyor.com/api/projects/status/github/zkat/cacache?svg=true)](https://ci.appveyor.com/project/zkat/cacache) [![Coverage Status](https://coveralls.io/repos/github/zkat/cacache/badge.svg?branch=latest)](https://coveralls.io/github/zkat/cacache?branch=latest)

[`cacache`](https://github.com/zkat/cacache) is a Node.js library for managing
local key and content address caches. It's really fast, really good at
concurrency, and it will never give you corrupted data, even if cache files
get corrupted or manipulated.

It was originally written to be used as [npm](https://npm.im)'s local cache, but
can just as easily be used on its own.

_Translations: [español](README.es.md)_

## Install

`$ npm install --save cacache`

## Table of Contents

* [Example](#example)
* [Features](#features)
* [Contributing](#contributing)
* [API](#api)
  * [Using localized APIs](#localized-api)
  * Reading
    * [`ls`](#ls)
    * [`ls.stream`](#ls-stream)
    * [`get`](#get-data)
    * [`get.stream`](#get-stream)
    * [`get.info`](#get-info)
    * [`get.hasContent`](#get-hasContent)
  * Writing
    * [`put`](#put-data)
    * [`put.stream`](#put-stream)
    * [`put*` opts](#put-options)
    * [`rm.all`](#rm-all)
    * [`rm.entry`](#rm-entry)
    * [`rm.content`](#rm-content)
  * Utilities
    * [`setLocale`](#set-locale)
    * [`clearMemoized`](#clear-memoized)
    * [`tmp.mkdir`](#tmp-mkdir)
    * [`tmp.withTmp`](#with-tmp)
  * Integrity
    * [Subresource Integrity](#integrity)
    * [`verify`](#verify)
    * [`verify.lastRun`](#verify-last-run)

### Example

```javascript
const cacache = require('cacache/en')
const fs = require('fs')

const cachePath = '/tmp/my-toy-cache'
const key = 'my-unique-key-1234'
const destination = '/tmp/mytar.tgz'

// Cache it! Use `cachePath` as the root of the content cache
cacache.put(cachePath, key, '10293801983029384').then(integrity => {
  console.log(`Saved content to ${cachePath}.`)

  // Copy the contents out of the cache and into their destination!
  // But this time, use stream instead!
  cacache.get.stream(
    cachePath, key
  ).pipe(
    fs.createWriteStream(destination)
  ).on('finish', () => {
    console.log('done extracting!')
  })

  // The same thing, but skip the key index and look the content up by the
  // integrity value that `put` resolved with.
  cacache.get.byDigest(cachePath, integrity).then(data => {
    fs.writeFile(destination, data, err => {
      if (err) throw err
      console.log('tarball data fetched based on its sha512sum and written out!')
    })
  })
})
```

### Features

* Extraction by key or by content address (shasum, etc)
* [Subresource Integrity](#integrity) web standard support
* Multi-hash support - safely host sha1, sha512, etc, in a single cache
* Automatic content deduplication
* Fault tolerance (immune to corruption, partial writes, process races, etc)
* Consistency guarantees on read and write (full data verification)
* Lockless, high-concurrency cache access
* Streaming support
* Promise support
* Pretty darn fast -- sub-millisecond reads and writes including verification
* Arbitrary metadata storage
* Garbage collection and additional offline verification
* Thorough test coverage
* There's probably a bloom filter in there somewhere. Those are cool, right? 🤔

### Contributing

The cacache team enthusiastically welcomes contributions and project participation! There's a bunch of things you can do if you want to contribute! The [Contributor Guide](CONTRIBUTING.md) has all the information you need for everything from reporting bugs to contributing entire new features. Please don't hesitate to jump in if you'd like to, or even ask us questions if something isn't clear.

All participants and maintainers in this project are expected to follow the [Code of Conduct](CODE_OF_CONDUCT.md), and just generally be excellent to each other.

Please refer to the [Changelog](CHANGELOG.md) for project history details, too.

Happy hacking!

### API

#### <a name="localized-api"></a> Using localized APIs

cacache includes a complete API in English, with the same features as other
translations. To use the English API as documented in this README, use
`require('cacache/en')`. This is also currently the default if you do
`require('cacache')`, but may change in the future.

cacache also supports other languages! You can find the list of currently
supported ones by looking in `./locales` in the source directory. You can use
the API in that language with `require('cacache/<lang>')`.

Want to add support for a new language? Please go ahead! You should be able to
copy `./locales/en.js` and `./locales/en.json` and fill them in. Translating the
`README.md` is a bit more work, but also appreciated if you get around to it. 👍🏼

#### <a name="ls"></a> `> cacache.ls(cache) -> Promise<Object>`

Lists info for all entries currently in the cache as a single large object. Each
entry in the object will be keyed by the unique index key, with corresponding
[`get.info`](#get-info) objects as the values.

##### Example

```javascript
cacache.ls(cachePath).then(console.log)
// Output
{
  'my-thing': {
    key: 'my-thing',
    integrity: 'sha512-BaSe64/EnCoDED+HAsh==',
    path: '.testcache/content/deadbeef', // joined with `cachePath`
    time: 12345698490,
    size: 4023948,
    metadata: {
      name: 'blah',
      version: '1.2.3',
      description: 'this was once a package but now it is my-thing'
    }
  },
  'other-thing': {
    key: 'other-thing',
    integrity: 'sha1-ANothER+hasH=',
    path: '.testcache/content/bada55',
    time: 11992309289,
    size: 111112
  }
}
```

#### <a name="ls-stream"></a> `> cacache.ls.stream(cache) -> Readable`

Lists info for all entries currently in the cache as a stream, one entry at a
time.

This works just like [`ls`](#ls), except [`get.info`](#get-info) entries are
returned as `'data'` events on the returned stream.

##### Example

```javascript
cacache.ls.stream(cachePath).on('data', console.log)
// Output
{
  key: 'my-thing',
  integrity: 'sha512-BaSe64HaSh',
  path: '.testcache/content/deadbeef', // joined with `cachePath`
  time: 12345698490,
  size: 13423,
  metadata: {
    name: 'blah',
    version: '1.2.3',
    description: 'this was once a package but now it is my-thing'
  }
}

{
  key: 'other-thing',
  integrity: 'whirlpool-WoWSoMuchSupport',
  path: '.testcache/content/bada55',
  time: 11992309289,
  size: 498023984029
}

{
  ...
}
```

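To make the relationship between `ls` and `ls.stream` concrete, here is a minimal sketch that folds the `'data'` events of an entry stream back into a single object keyed by `key`. `collectEntries` is a hypothetical helper, not part of cacache, and the sample stream is a stand-in built with Node's `Readable.from` so the snippet runs without a cache on disk.

```javascript
const { Readable } = require('stream')

// Hypothetical helper (not part of cacache): gathers entry objects emitted as
// 'data' events into one object keyed by each entry's `key`.
function collectEntries (stream) {
  return new Promise((resolve, reject) => {
    const entries = {}
    stream.on('data', entry => { entries[entry.key] = entry })
    stream.on('error', reject)
    stream.on('end', () => resolve(entries))
  })
}

// Stand-in for `cacache.ls.stream(cachePath)`: an object-mode stream of entries.
const fakeLsStream = Readable.from([
  { key: 'my-thing', integrity: 'sha512-BaSe64HaSh', size: 13423 },
  { key: 'other-thing', integrity: 'sha1-ANothER+hasH=', size: 111112 }
])

const collected = collectEntries(fakeLsStream)
collected.then(entries => {
  console.log(Object.keys(entries))
})
```

With a real cache, the stand-in would presumably be replaced by `cacache.ls.stream(cachePath)`. The streaming form is mostly useful when you want to act on entries one by one without holding the whole listing in memory.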
#### <a name="get-data"></a> `> cacache.get(cache, key, [opts]) -> Promise({data, metadata, integrity})`

Returns an object with the cached data, digest, and metadata identified by
`key`. The `data` property of this object will be a `Buffer` instance that
presumably holds some data that means something to you. I'm sure you know what
to do with it! cacache just won't care.

`integrity` is a [Subresource Integrity](#integrity) string. That is, a string
that can be used to verify `data`, which looks like
`<hash-algorithm>-<base64-integrity-hash>`.

If there is no content identified by `key`, or if the locally-stored data does
not pass the validity checksum, the promise will be rejected.

A sub-function, `get.byDigest`, may be used for identical behavior, except lookup
will happen by integrity hash, bypassing the index entirely. This version of the
function *only* returns `data` itself, without any wrapper.

##### Note

This function loads the entire cache entry into memory before returning it. If
you're dealing with Very Large data, consider using [`get.stream`](#get-stream)
instead.

##### Example

```javascript
// Look up by key
cacache.get(cachePath, 'my-thing').then(console.log)
// Output:
{
  metadata: {
    thingName: 'my'
  },
  integrity: 'sha512-BaSe64HaSh',
  data: Buffer#<deadbeef>,
  size: 9320
}

// Look up by digest
cacache.get.byDigest(cachePath, 'sha512-BaSe64HaSh').then(console.log)
// Output:
Buffer#<deadbeef>
```

#### <a name="get-stream"></a> `> cacache.get.stream(cache, key, [opts]) -> Readable`

Returns a [Readable Stream](https://nodejs.org/api/stream.html#stream_readable_streams) of the cached data identified by `key`.

If there is no content identified by `key`, or if the locally-stored data does
not pass the validity checksum, an error will be emitted.

`metadata` and `integrity` events will be emitted before the stream closes, if
you need to collect that extra data about the cached entry.

A sub-function, `get.stream.byDigest`, may be used for identical behavior,
except lookup will happen by integrity hash, bypassing the index entirely. This
version does not emit the `metadata` and `integrity` events at all.

##### Example

```javascript
// Look up by key
cacache.get.stream(
  cachePath, 'my-thing'
).on('metadata', metadata => {
  console.log('metadata:', metadata)
}).on('integrity', integrity => {
  console.log('integrity:', integrity)
}).pipe(
  fs.createWriteStream('./x.tgz')
)
// Outputs:
metadata: { ... }
integrity: 'sha512-SoMeDIGest+64=='

// Look up by digest
cacache.get.stream.byDigest(
  cachePath, 'sha512-SoMeDIGest+64=='
).pipe(
  fs.createWriteStream('./x.tgz')
)
```

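If you want the `metadata` and `integrity` values alongside the streamed bytes, one option is to gather them while the stream is being consumed. The sketch below is only an illustration: `readWithInfo` is a hypothetical helper, and a `PassThrough` stream that emits the two events by hand stands in for `cacache.get.stream(cachePath, key)`.

```javascript
const { PassThrough } = require('stream')

// Hypothetical helper (not part of cacache): buffers a stream and records the
// `metadata` and `integrity` events it emits along the way.
function readWithInfo (stream) {
  return new Promise((resolve, reject) => {
    const chunks = []
    const result = { metadata: undefined, integrity: undefined }
    stream.on('metadata', metadata => { result.metadata = metadata })
    stream.on('integrity', integrity => { result.integrity = integrity })
    stream.on('data', chunk => chunks.push(chunk))
    stream.on('error', reject)
    stream.on('end', () => {
      result.data = Buffer.concat(chunks)
      resolve(result)
    })
  })
}

// Stand-in for `cacache.get.stream(cachePath, key)` so the sketch runs alone.
const fakeGetStream = new PassThrough()
const pending = readWithInfo(fakeGetStream)
fakeGetStream.emit('metadata', { thingName: 'my' })
fakeGetStream.emit('integrity', 'sha512-BaSe64HaSh')
fakeGetStream.end('hello')

pending.then(({ data, metadata, integrity }) => {
  console.log(data.toString(), metadata, integrity)
})
```

Note that buffering every chunk gives up the memory benefit of streaming, so in practice you would likely pipe the data somewhere and keep only the two event payloads.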
#### <a name="get-info"></a> `> cacache.get.info(cache, key) -> Promise`

Looks up `key` in the cache index, returning information about the entry if
one exists.

##### Fields

* `key` - Key the entry was looked up under. Matches the `key` argument.
* `integrity` - [Subresource Integrity hash](#integrity) for the content this entry refers to.
* `path` - Filesystem path relative to `cache` argument where content is stored.
* `time` - Timestamp the entry was first added on.
* `metadata` - User-assigned metadata associated with the entry/content.

##### Example

```javascript
cacache.get.info(cachePath, 'my-thing').then(console.log)

// Output
{
  key: 'my-thing',
  integrity: 'sha256-MUSTVERIFY+ALL/THINGS==',
  path: '.testcache/content/deadbeef',
  time: 12345698490,
  size: 849234,
  metadata: {
    name: 'blah',
    version: '1.2.3',
    description: 'this was once a package but now it is my-thing'
  }
}
```

#### <a name="get-hasContent"></a> `> cacache.get.hasContent(cache, integrity) -> Promise`

Looks up a [Subresource Integrity hash](#integrity) in the cache. If content
exists for this `integrity`, it will return an object with the specific single
integrity hash that was found under the `sri` key, and the size of the found
content as `size`. If no content exists for this integrity, it will return
`false`.

##### Example

```javascript
cacache.get.hasContent(cachePath, 'sha256-MUSTVERIFY+ALL/THINGS==').then(console.log)

// Output
{
  sri: {
    source: 'sha256-MUSTVERIFY+ALL/THINGS==',
    algorithm: 'sha256',
    digest: 'MUSTVERIFY+ALL/THINGS==',
    options: []
  },
  size: 9001
}

cacache.get.hasContent(cachePath, 'sha512-NOT+IN/CACHE==').then(console.log)

// Output
false
```

#### <a name="put-data"></a> `> cacache.put(cache, key, data, [opts]) -> Promise`

Inserts data passed to it into the cache. The returned Promise resolves with a
digest (generated according to [`opts.algorithms`](#optsalgorithms)) after the
cache entry has been successfully written.

##### Example

```javascript
fetch(
  'https://registry.npmjs.org/cacache/-/cacache-1.0.0.tgz'
).then(res => {
  // Read the whole response body before handing it to the cache.
  return res.arrayBuffer()
}).then(body => {
  return cacache.put(cachePath, 'registry.npmjs.org|cacache@1.0.0', Buffer.from(body))
}).then(integrity => {
  console.log('integrity hash is', integrity)
})
```

#### <a name="put-stream"></a> `> cacache.put.stream(cache, key, [opts]) -> Writable`

Returns a [Writable
Stream](https://nodejs.org/api/stream.html#stream_writable_streams) that inserts
data written to it into the cache. Emits an `integrity` event with the digest of
written contents when it succeeds.

##### Example

```javascript
request.get(
  'https://registry.npmjs.org/cacache/-/cacache-1.0.0.tgz'
).pipe(
  cacache.put.stream(
    cachePath, 'registry.npmjs.org|cacache@1.0.0'
  ).on('integrity', d => console.log(`integrity digest is ${d}`))
)
```

#### <a name="put-options"></a> `> cacache.put options`

`cacache.put` functions have a number of options in common.

##### `opts.metadata`

Arbitrary metadata to be attached to the inserted key.

##### `opts.size`

If provided, the data stream will be verified to check that enough data was
passed through. If there's more or less data than expected, insertion will fail
with an `EBADSIZE` error.

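The idea behind the size check can be sketched with a small pass-through stream that counts bytes. This is not cacache's implementation: `sizeChecker` is a hypothetical helper, and only the `EBADSIZE` code is taken from the description above.

```javascript
const { Transform } = require('stream')

// Hypothetical helper (not part of cacache): passes data through unchanged and
// fails with an EBADSIZE-coded error if the byte count differs from `expected`.
function sizeChecker (expected) {
  let seen = 0
  const badSize = message => Object.assign(new Error(message), { code: 'EBADSIZE' })
  return new Transform({
    transform (chunk, encoding, callback) {
      seen += chunk.length
      if (seen > expected) {
        // Too much data: fail as soon as the limit is crossed.
        return callback(badSize(`expected ${expected} bytes, got at least ${seen}`))
      }
      callback(null, chunk)
    },
    flush (callback) {
      // Too little data can only be detected once the input has ended.
      if (seen !== expected) {
        return callback(badSize(`expected ${expected} bytes, got ${seen}`))
      }
      callback()
    }
  })
}

const checker = sizeChecker(3)
checker.on('error', err => console.error('failed with', err.code))
checker.on('data', chunk => console.log('passed through:', chunk.toString()))
checker.end('abc')
```

The sketch reports an oversized input immediately but can only report an undersized one at the end of the stream, which is why the two checks live in different callbacks.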
##### `opts.integrity`

If present, the pre-calculated digest for the inserted content. If this option
is provided and does not match the post-insertion digest, insertion will fail
with an `EINTEGRITY` error.

`algorithms` has no effect if this option is present.

##### `opts.algorithms`

Default: `['sha512']`

Hashing algorithms to use when calculating the [subresource integrity
digest](#integrity) for inserted data. Can use any algorithm listed in
`crypto.getHashes()` or `'omakase'`/`'お任せします'` to pick a random hash
algorithm on each insertion. You may also use any anagram of `'modnar'` to use
this feature.

Currently only supports one algorithm at a time (i.e., an array length of
exactly `1`). Has no effect if `opts.integrity` is present.

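If you build the `algorithms` option from user input, a pre-flight check along these lines may save a confusing failure later. It is a sketch under the constraints stated above (exactly one entry, and it must appear in `crypto.getHashes()`). `checkAlgorithms` is a hypothetical name, and the sketch deliberately ignores the `'omakase'`-style random-pick values.

```javascript
const crypto = require('crypto')

// Hypothetical pre-flight check (not part of cacache) mirroring the constraints
// described above: exactly one algorithm, and it must be known to Node.
function checkAlgorithms (algorithms) {
  if (!Array.isArray(algorithms) || algorithms.length !== 1) {
    throw new TypeError('algorithms must be an array with exactly one entry')
  }
  const [algorithm] = algorithms
  if (!crypto.getHashes().includes(algorithm)) {
    throw new Error(`unsupported hash algorithm: ${algorithm}`)
  }
  return algorithm
}

console.log(checkAlgorithms(['sha512'])) // sha512
```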
##### `opts.uid`/`opts.gid`

If provided, cacache will do its best to make sure any new files added to the
cache use this particular `uid`/`gid` combination. This can be used,
for example, to drop permissions when someone uses `sudo`, but cacache makes
no assumptions about your needs here.

##### `opts.memoize`

Default: `null`

If provided, cacache will memoize the given cache insertion in memory, bypassing
any filesystem checks for that key or digest in future cache fetches. Nothing
will be written to the in-memory cache unless this option is explicitly truthy.

If `opts.memoize` is an object or a `Map`-like (that is, an object with `get`
and `set` methods), it will be written to instead of the global memoization
cache.

Reading data from disk can be forced by explicitly passing `memoize: false` to
the reader functions, but their default will be to read from memory.

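Because any object with `get` and `set` methods qualifies as `Map`-like, you can supply your own store, for example one that caps how many entries it keeps. The class below is a hypothetical illustration of such a store, and the `cacache.put` call in the trailing comment is assumed usage based on the description above.

```javascript
// Hypothetical Map-like store (not part of cacache): keeps at most `max`
// entries and evicts the oldest insertion first.
class BoundedMemo {
  constructor (max) {
    this.max = max
    this.store = new Map()
  }

  get (key) {
    return this.store.get(key)
  }

  set (key, value) {
    if (!this.store.has(key) && this.store.size >= this.max) {
      // Map iterates in insertion order, so the first key is the oldest.
      const oldest = this.store.keys().next().value
      this.store.delete(oldest)
    }
    this.store.set(key, value)
    return this
  }
}

const memo = new BoundedMemo(2)
memo.set('a', 1).set('b', 2).set('c', 3)
console.log(memo.get('a'), memo.get('c')) // undefined 3

// Assumed usage, based on the description above:
// cacache.put(cachePath, key, data, { memoize: memo })
```

A bounded store like this trades repeat disk reads for a predictable memory ceiling, which may matter when cached entries are large.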
#### <a name="rm-all"></a> `> cacache.rm.all(cache) -> Promise`

Clears the entire cache. Mainly by blowing away the cache directory itself.

##### Example

```javascript
cacache.rm.all(cachePath).then(() => {
  console.log('THE APOCALYPSE IS UPON US 😱')
})
```

#### <a name="rm-entry"></a> `> cacache.rm.entry(cache, key) -> Promise`

Alias: `cacache.rm`

Removes the index entry for `key`. Content will still be accessible if
requested directly by content address ([`get.stream.byDigest`](#get-stream)).

To remove the content itself (which might still be used by other entries), use
[`rm.content`](#rm-content). Or, to safely vacuum any unused content, use
[`verify`](#verify).

##### Example

```javascript
cacache.rm.entry(cachePath, 'my-thing').then(() => {
  console.log('I did not like it anyway')
})
```

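The split between index entries and content can be pictured with a toy in-memory model. It is only a sketch of the concept, not cacache's on-disk layout, and the digests are made-up placeholders: removing an index entry leaves the content reachable by digest, and content with no remaining index entry is what a garbage collection pass could reclaim.

```javascript
// Toy in-memory model (not cacache's real on-disk layout): an index mapping
// keys to digests, and a content store mapping digests to data.
const index = new Map()
const content = new Map()

function put (key, digest, data) {
  content.set(digest, data) // identical content is stored once per digest
  index.set(key, digest)
}

function rmEntry (key) {
  index.delete(key) // content is left in place
}

// Digests that no index entry refers to any more.
function unreferenced () {
  const live = new Set(index.values())
  return [...content.keys()].filter(digest => !live.has(digest))
}

put('my-thing', 'sha512-AAA', 'hello')
put('alias-of-my-thing', 'sha512-AAA', 'hello')
put('other-thing', 'sha512-BBB', 'world')

rmEntry('other-thing')
console.log(content.get('sha512-BBB')) // world (still readable by digest)
console.log(unreferenced()) // [ 'sha512-BBB' ]

rmEntry('my-thing')
console.log(unreferenced()) // [ 'sha512-BBB' ] ('sha512-AAA' still has an alias)
```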
#### <a name="rm-content"></a> `> cacache.rm.content(cache, integrity) -> Promise`

Removes the content identified by `integrity`. Any index entries referring to it
will not be usable again until the content is re-added to the cache with an
identical digest.

##### Example

```javascript
cacache.rm.content(cachePath, 'sha512-SoMeDIGest/IN+BaSE64==').then(() => {
  console.log('data for my-thing is gone!')
})
```

#### <a name="set-locale"></a> `> cacache.setLocale(locale)`

Configure the language/locale used for messages and errors coming from cacache.
The list of available locales is in the `./locales` directory in the project
root.

_Interested in contributing more languages? [Submit a PR](CONTRIBUTING.md)!_

#### <a name="clear-memoized"></a> `> cacache.clearMemoized()`

Completely resets the in-memory entry cache.

#### <a name="tmp-mkdir"></a> `> tmp.mkdir(cache, opts) -> Promise<Path>`

Returns a unique temporary directory inside the cache's `tmp` dir. This
directory will use the same safe user assignment that all the other stuff uses.

Once the directory is made, it's the user's responsibility to make sure all
files within are created according to the same `opts.gid`/`opts.uid` settings
that would be passed in. If not, you can ask cacache to do it for you by calling
[`tmp.fix()`](#tmp-fix), which will fix all tmp directory permissions.

If you want automatic cleanup of this directory, use
[`tmp.withTmp()`](#with-tmp).

##### Example

```javascript
cacache.tmp.mkdir(cache).then(dir => {
  fs.writeFile(path.join(dir, 'blablabla'), Buffer#<1234>, ...)
})
```

#### <a name="with-tmp"></a> `> tmp.withTmp(cache, opts, cb) -> Promise`

Creates a temporary directory with [`tmp.mkdir()`](#tmp-mkdir) and calls `cb`
with it. The created temporary directory will be removed when the return value
of `cb()` resolves -- that is, if you return a Promise from `cb()`, the tmp
directory will be automatically deleted once that promise completes.

The same caveats apply when it comes to managing permissions for the tmp dir's
contents.

##### Example

```javascript
cacache.tmp.withTmp(cache, dir => {
  return fs.writeFileAsync(path.join(dir, 'blablabla'), Buffer#<1234>, ...)
}).then(() => {
  // `dir` no longer exists
})
```

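The "clean up once the callback's promise settles" pattern can be sketched with nothing but Node's standard library. This is a hypothetical stand-in rather than cacache's implementation: `withTmpDir` and the `sketch-` prefix are made up, it uses the OS temp directory instead of the cache's `tmp` dir, and it assumes a Node version that provides `fs.promises.rm`.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')

// Hypothetical stand-in for the pattern (not cacache's implementation): make a
// temp dir, hand it to `cb`, and remove it once `cb`'s promise settles.
function withTmpDir (prefix, cb) {
  return fs.promises.mkdtemp(path.join(os.tmpdir(), prefix)).then(dir => {
    const cleanup = () => fs.promises.rm(dir, { recursive: true, force: true })
    return Promise.resolve()
      .then(() => cb(dir))
      .then(
        value => cleanup().then(() => value),
        err => cleanup().then(() => { throw err })
      )
  })
}

const finished = withTmpDir('sketch-', dir => {
  return fs.promises.writeFile(path.join(dir, 'blablabla'), 'hello')
    .then(() => dir)
})

finished.then(dir => {
  console.log(fs.existsSync(dir)) // false: the directory was removed
})
```

The sketch cleans up on rejection as well as on success, which is a design choice of this example and keeps failed callbacks from leaving directories behind.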
#### <a name="integrity"></a> Subresource Integrity Digests

For content verification and addressing, cacache uses strings following the
[Subresource
Integrity spec](https://developer.mozilla.org/en-US/docs/Web/Security/Subresource_Integrity).
That is, any time cacache expects an `integrity` argument or option, it
should be in the format `<hashAlgorithm>-<base64-hash>`.

One deviation from the current spec is that cacache will support any hash
algorithms supported by the underlying Node.js process. You can use
`crypto.getHashes()` to see which ones you can use.

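As an illustration of what such a string encodes, the sketch below checks a piece of data against a single `<hashAlgorithm>-<base64-hash>` value using only Node's `crypto` module. `matchesIntegrity` is a hypothetical helper. It ignores the multi-hash and option syntax the full spec allows, which a library such as [`ssri`](https://npm.im/ssri) handles.

```javascript
const crypto = require('crypto')

// Hypothetical helper (not part of cacache): checks `data` against a single
// `<hashAlgorithm>-<base64-hash>` string.
function matchesIntegrity (data, integrity) {
  // Base64 never contains '-', so the last dash separates algorithm and hash,
  // even for algorithm names that themselves contain a dash.
  const dash = integrity.lastIndexOf('-')
  if (dash === -1) throw new Error(`malformed integrity string: ${integrity}`)
  const algorithm = integrity.slice(0, dash)
  const expected = integrity.slice(dash + 1)
  if (!crypto.getHashes().includes(algorithm)) {
    throw new Error(`unsupported hash algorithm: ${algorithm}`)
  }
  const actual = crypto.createHash(algorithm).update(data).digest('base64')
  return actual === expected
}

const sample = 'sha256-' +
  crypto.createHash('sha256').update('foobarbaz').digest('base64')
console.log(matchesIntegrity('foobarbaz', sample)) // true
console.log(matchesIntegrity('tampered', sample)) // false
```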
##### Generating Digests Yourself

If you have an existing content shasum, it is generally formatted as a
hexadecimal string (that is, a sha1 would look like:
`5f5513f8822fdbe5145af33b64d8d970dcf95c6e`). In order to be compatible with
cacache, you'll need to convert this to an equivalent subresource integrity
string. For this example, the corresponding hash would be:
`sha1-X1UT+IIv2+UUWvM7ZNjZcNz5XG4=`.

If you want to generate an integrity string yourself for existing data, you can
use something like this:

```javascript
const crypto = require('crypto')
const hashAlgorithm = 'sha512'
const data = 'foobarbaz'

const integrity = (
  hashAlgorithm +
  '-' +
  crypto.createHash(hashAlgorithm).update(data).digest('base64')
)
```

You can also use [`ssri`](https://npm.im/ssri) to have a richer set of functionality
around SRI strings, including generation, parsing, and translating from existing
hex-formatted strings.

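For a one-off conversion of a hex shasum, re-encoding the digest bytes as base64 is enough. The helper name below is hypothetical; the input and expected result are the sha1 pair quoted above.

```javascript
// Hypothetical helper: re-encodes a hex digest as `<algorithm>-<base64>`.
function hexToIntegrity (algorithm, hexDigest) {
  return `${algorithm}-${Buffer.from(hexDigest, 'hex').toString('base64')}`
}

console.log(hexToIntegrity('sha1', '5f5513f8822fdbe5145af33b64d8d970dcf95c6e'))
// sha1-X1UT+IIv2+UUWvM7ZNjZcNz5XG4=
```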
#### <a name="verify"></a> `> cacache.verify(cache, opts) -> Promise`

Checks out and fixes up your cache:

* Cleans up corrupted or invalid index entries.
* Custom entry filtering options.
* Garbage collects any content entries not referenced by the index.
* Checks integrity for all content entries and removes invalid content.
* Fixes cache ownership.
* Removes the `tmp` directory in the cache and all its contents.

When it's done, it'll return an object with various stats about the verification
process, including amount of storage reclaimed, number of valid entries, number
of entries removed, etc.

##### Options

* `opts.uid` - uid to assign to cache and its contents
* `opts.gid` - gid to assign to cache and its contents
* `opts.filter` - receives a formatted entry. Return false to remove it.
  Note: might be called more than once on the same entry.

##### Example

```sh
echo somegarbage >> $CACHEPATH/content/deadbeef
```

```javascript
cacache.verify(cachePath).then(stats => {
  // deadbeef collected, because of invalid checksum.
  console.log('cache is much nicer now! stats:', stats)
})
```

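As a sketch of what an `opts.filter` function might look like, the example below keeps only entries added within a retention window. It rests on two assumptions not confirmed above: that the formatted entry carries the [`get.info`](#get-info) fields, and that `time` is in milliseconds since the epoch. `makeAgeFilter` and the 30-day window are hypothetical.

```javascript
// Assumption: entries passed to `filter` carry the `get.info` fields,
// including `time` in milliseconds since the epoch.
const MAX_AGE_MS = 30 * 24 * 60 * 60 * 1000 // hypothetical 30-day retention

function makeAgeFilter (now, maxAgeMs) {
  // Returning false asks `verify` to drop the entry. The function is pure,
  // so being called more than once for the same entry is harmless.
  return entry => now - entry.time <= maxAgeMs
}

const keepRecent = makeAgeFilter(Date.now(), MAX_AGE_MS)
console.log(keepRecent({ key: 'fresh', time: Date.now() })) // true
console.log(keepRecent({ key: 'stale', time: 0 })) // false

// Assumed usage:
// cacache.verify(cachePath, { filter: keepRecent })
```

Keeping the filter free of side effects matters here, since the options list notes it might be called more than once on the same entry.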
#### <a name="verify-last-run"></a> `> cacache.verify.lastRun(cache) -> Promise`

Returns a `Date` representing the last time `cacache.verify` was run on `cache`.

##### Example

```javascript
cacache.verify(cachePath).then(() => {
  return cacache.verify.lastRun(cachePath)
}).then(lastTime => {
  console.log('cacache.verify was last called on ' + lastTime)
})
```