How to build AI Startups

ecosystem / partners

long tail / corner cases (domain knowledge)

intelligence vs skills

when to start (neither unclear nor 100% clear)

niche market (big companies don’t want to do)

  1. The core of technology industry forecasting is to judge the timing of large-scale commercial use based on the industry foundation.
  2. Super cycles are typical characteristics of technological revolutions, driven by a core technology and a series of supporting technologies.
  3. The artificial intelligence revolution has already occurred and will combine with more supporting technologies, spreading to more fields, but the timing will vary.
  4. Areas worth paying attention to in the near term: intelligent services, robotics, MR+AI.

Give employees credits to use the product (as Uber does): eat your own dog food, file and fix bugs, and you will be surprised by what you find

How to evaluate an AI startup?

solid tech / product market fit (PMF) / domain expertise (domain knowledge, customer base)

partnership with big companies

KOL on social media (LinkedIn, Twitter, etc)

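1. The technology platform determines how information spreads; social development determines how credit is accumulated
2. Once the noise around a new platform dies down, a credit system suited to the current stage of society gradually settles out
3. Product scaling and service scaling will coexist for a long time; online platforms can accumulate product / service credit, but in different ways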

04/19/2024

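  1. Technology transition: technology is in the middle of a hard-to-soft replacement, and the countdown to the industry's breakout has begun; the key to the hard-to-soft switch is turning data into patches ("Patch-ification" of data)
  2. China's window: during the transition the robotics industry depends even more on large-scale manufacturing capability, and Chinese companies have now reached the window for entering the market
  3. Future directions for the industry:
  • Traditional replacement: the traditional robot market leaves room for replacement, but it may be hard to overtake the existing giants
  • Transition and optimization: industrial robot companies will keep trying to cut costs and improve performance and will move into commercial robots, but they are unlikely to succeed on their original path
  • Defining new scenarios: the future winners will not simply transition from industrial robots; they will be commercial robots redesigned around new scenarios

  4. Transition strategy: new robots need to start from application needs and form a complete, complex system; this requires integrating large-scale manufacturing capability with the ability to combine hardware and software technologies

  1. Trend observation: commercial applications that deeply combine AI and robotics are about to surface; areas such as material handling are only technical testing and R&D; the most promising direction is the intelligent robotic arm, which deeply integrates intelligent manipulation with the scenario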

04/28/2024

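1. Traditional VR devices were driven by hardware; current AR devices face challenges and have not yet broken through;

2. The AI revolution has lowered the barrier to AR application development, creating an application-pull opportunity;

3. AR hardware may need another two device generations (four years) to mature;

4. Going forward, watch both hardware price-performance and the application development environment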

[Summary] DSA / SD / BQ

Scientific training method (goal, measurable, repeatable, predictable)

Pattern recognition (sliding window / monotonic decreasing stack / binary search)
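
As an illustration of one of these patterns, here is a minimal sketch of a monotonic stack applied to the classic "next greater element" problem. The function name and the choice of problem are assumptions made for this example, not part of the original notes.

```python
def next_greater(nums):
    """For each element, return the next larger element to its right, or -1."""
    result = [-1] * len(nums)
    stack = []  # indices whose values are non-increasing from bottom to top
    for i, value in enumerate(nums):
        # Every index on the stack with a smaller value has just found
        # its next greater element: the current value.
        while stack and nums[stack[-1]] < value:
            result[stack.pop()] = value
        stack.append(i)
    return result


print(next_greater([2, 1, 2, 4, 3]))  # [4, 2, 4, -1, -1]
```

Each index is pushed and popped at most once, so the whole pass is O(n) even though it contains a nested loop.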

SD

Product requirement (ask questions to clarify, back-of-the-envelope estimation)

High level design (functional, non-functional => scalability, availability, reliability, latency, observability – logging / monitoring / alerting)

Low level design (LB, cache – LRU / LFU, CDN, partition – master / slave, sharding – consistent hashing, virtual node, Message queue, File storage / system)

Wrap up

BQ

FPA EP (NodeJS vs JSP)

SDUI (address service vs one API)

Register without password

Reference

Algorithm

Top 5 Coding Interview MISTAKES (from a Google Engineer)

0:38 – Jumping straight to code 

2:30 – Going Silent 

3:40 – Not preparing (correctly) 

4:24 – Tunnel Visioning 

5:54 – Not managing time

The Power of Specializing – https://www.youtube.com/watch?v=qrnM2l26ZhA

Why I focus on patterns instead of technologies – https://www.youtube.com/watch?v=F1tuoMobTfQ

Getting a Tech Job in 2024 – https://www.youtube.com/watch?v=KDetTl7_CeA

Have coding interviews gotten harder? – https://www.youtube.com/watch?v=NpvhPn-7Zh8

(supply vs demand, filter, try best and be best, luck involved)

The LeetCode Fallacy – https://www.youtube.com/watch?v=2V7yPrxJ8Ck

Leetcode – The Path to Enlightenment – https://www.youtube.com/watch?v=VHZDxOmRthE

(memorizing + understanding)

Yes, FAANG Prestige is Overrated – https://www.youtube.com/watch?v=jPFnSuxWcYI

(title / position vs work / content / capability)

NeetCode – https://github.com/neetcode-gh/leetcode/tree/main/python

Practice – https://github.com/yao23/Machine_Learning_Playground/tree/master/LeetCode

BQ

Jackson Gabbard https://www.youtube.com/@jackson-gabbard/videos

Episode 07: Intro to Behavioural Interviews https://www.youtube.com/watch?v=PJKYqLP6MRE

[Notes] System Design Interview – Alex Xu

Chapter 1 – Scale from zero to millions of users

Keep web tier stateless

Build redundancy at every tier

Cache data as much as you can

Support multiple data centers

Host static assets in CDN

Scale your data tier by sharding

Split tiers into individual servers

Monitor your system and use automation tools

Chapter 2 – Back of the envelope estimation 

Rounding and Approximation

Write down your assumptions

Label your units

Commonly asked back-of-the-envelope estimations

Chapter 3 – A framework for system design interview

Step 1 – Understand the problem and establish design scope: 3 to 10 minutes

Step 2 – Propose high-level design and get buy-in: 10 to 15 minutes

Step 3 – Design deep dive: 10 to 25 minutes

Step 4 – Wrap up: 3 to 5 minutes

Chapter 4 – Design a rate limiter

Token bucket

Leaking bucket

Fixed window

Sliding window log

Sliding window counter
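
To make the first algorithm in the list concrete, here is a minimal single-process sketch of a token bucket. The class name, the parameters and the injectable `clock` argument are assumptions made for illustration; a production rate limiter would usually keep its counters in a shared store rather than in local memory.

```python
import time


class TokenBucket:
    """Token bucket: holds up to `capacity` tokens, refilled at `refill_rate` per second."""

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate  # tokens added per second
        self.clock = clock              # injectable so the bucket can be tested
        self.tokens = float(capacity)   # start with a full bucket
        self.last_refill = clock()

    def allow(self):
        """Return True and consume one token if the request may pass."""
        now = self.clock()
        elapsed = now - self.last_refill
        # Refill lazily based on elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

With `TokenBucket(capacity=2, refill_rate=1.0)`, two back-to-back requests pass, a third is rejected, and one more is allowed after a second has elapsed. The capacity controls how large a burst is tolerated, while the refill rate sets the sustained request rate.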

Chapter 5 – Design consistent hashing

The rehashing problem

Consistent hashing (hash space and hash ring, hash servers and keys, server lookup, add and remove a server)

Virtual nodes (find affected keys)

Only a minimal number of keys are redistributed when servers are added or removed

Easy to scale horizontally because data are more evenly distributed 

Mitigate hotspot key problem (by distributing data more evenly)

Apps: Amazon Dynamo database, Apache Cassandra, Discord, Akamai CDN, Maglev network load balancer
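
The ring lookup and the effect of virtual nodes can be sketched as follows. This is a minimal illustration, not how any of the systems listed above implement it: the `HashRing` class, the `server#i` naming of virtual nodes, and the use of MD5 as the hash function are all assumptions made for the example.

```python
import bisect
import hashlib


def _hash(key):
    """Map a string to a position on the ring (MD5 chosen only for illustration)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, servers=(), vnodes=100):
        self.vnodes = vnodes  # virtual nodes per physical server
        self._ring = []       # sorted list of (position, server)
        for server in servers:
            self.add(server)

    def add(self, server):
        # Each server is placed on the ring at `vnodes` pseudo-random positions.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{server}#{i}"), server))

    def remove(self, server):
        self._ring = [(pos, s) for pos, s in self._ring if s != server]

    def lookup(self, key):
        """Walk clockwise from the key's position to the first virtual node."""
        if not self._ring:
            raise LookupError("ring is empty")
        idx = bisect.bisect_left(self._ring, (_hash(key), ""))
        return self._ring[idx % len(self._ring)][1]  # wrap around the ring


ring = HashRing(["s0", "s1", "s2"])
owner = ring.lookup("user:42")  # one of "s0", "s1", "s2"
```

When a server is added, the only keys that change owner are those whose clockwise walk now hits one of the new server's virtual nodes first; every other key stays where it was.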

Chapter 6 – Design a key-value store

CAP theorem

Consistency: all clients see the same data at the same time no matter what node they connect to

Availability: any client which requests data gets a response even if some of the nodes are down

Partition Tolerance: the system continues to operate despite network partitions (a communication break between two nodes)

Goal and Solution

Ability to store big data: use consistent hashing to spread load across servers

High availability reads: data replication, multi-data center setup

High availability writes: versioning and conflict resolution with vector clocks (serverId, version)

Dataset partition: consistent hashing

Incremental scalability: consistent hashing

Heterogeneity: consistent hashing

Tunable consistency: Quorum consensus (R + W > N)

Failure detection: heartbeat sent to other nodes (mark down if no signal after the threshold time)

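Handling temporary failures: sloppy quorum and hinted handoff (relax the strict quorum requirement: use the first W healthy nodes for writes and the first R healthy nodes for reads, ignoring nodes that are down; push the changes back once the down node is up again)

Handling permanent failures: Merkle tree (each level of the tree stores hashes; compare hashes starting from the root until the differing data is found)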

Handling data center outage: cross-data center replication
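
The quorum condition R + W > N from the list above can be checked by brute force: it holds exactly when every possible read set shares at least one replica with every possible write set, so a read always reaches a replica that saw the latest write. The function below is a small sketch written for this note; its name is an assumption.

```python
from itertools import combinations


def quorums_always_overlap(n, w, r):
    """True if every set of w replicas intersects every set of r replicas out of n."""
    replicas = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


print(quorums_always_overlap(3, 2, 2))  # True:  R + W = 4 > 3
print(quorums_always_overlap(3, 1, 1))  # False: R + W = 2 <= 3
```

Setting W = 1 and R = N favors fast writes, W = N and R = 1 favors fast reads, and any choice with R + W <= N trades the overlap guarantee for lower latency.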

Chapter 7 – Design a unique ID generator in distributed system

Multi master replication: use the DB auto_increment feature, but increase by k, the number of DB servers in use (e.g. with k = 2, DB s1 generates IDs 1, 3, 5… and DB s2 generates IDs 2, 4, 6…)

UUID: universally unique identifier (128 bit, number and alphabetic)

Ticket Server: centralized auto_increment in a single DB server (Ticket Server) 

Twitter Snowflake ID Generator: Sign bit (1 bit) + Timestamp (41) + Datacenter ID (5) + Machine ID (5) + Sequence number (12 bits, increment by 1, reset to 0 every millisecond)
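
The bit layout above can be sketched as a small generator. This is an illustration under stated assumptions, not Twitter's implementation: the class name, the custom epoch value and the injectable `clock_ms` parameter are all hypothetical choices made for the example.

```python
import threading
import time

CUSTOM_EPOCH_MS = 1_700_000_000_000  # arbitrary custom epoch assumed for this sketch


class SnowflakeGenerator:
    """64-bit IDs: 1 sign bit | 41-bit timestamp | 5-bit datacenter | 5-bit machine | 12-bit sequence."""

    def __init__(self, datacenter_id, machine_id, clock_ms=None):
        if not (0 <= datacenter_id < 32 and 0 <= machine_id < 32):
            raise ValueError("datacenter_id and machine_id must fit in 5 bits")
        self.datacenter_id = datacenter_id
        self.machine_id = machine_id
        self.clock_ms = clock_ms or (lambda: int(time.time() * 1000))
        self.last_ms = -1
        self.sequence = 0
        self.lock = threading.Lock()

    def next_id(self):
        with self.lock:
            now = self.clock_ms()
            if now < self.last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self.last_ms:
                # Same millisecond: bump the 12-bit sequence.
                self.sequence = (self.sequence + 1) & 0xFFF
                if self.sequence == 0:
                    # 4096 IDs already issued this millisecond: wait for the next one.
                    while now <= self.last_ms:
                        now = self.clock_ms()
            else:
                self.sequence = 0  # new millisecond: reset the sequence
            self.last_ms = now
            return (
                ((now - CUSTOM_EPOCH_MS) << 22)
                | (self.datacenter_id << 17)
                | (self.machine_id << 12)
                | self.sequence
            )
```

Because the timestamp occupies the most significant bits, IDs generated later always sort after earlier ones from the same machine, and the datacenter and machine bits keep IDs from different machines distinct without any coordination.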

Pros and cons

UUID
Pros: generating a UUID is simple; easy to scale (each web server is responsible for generating the IDs it consumes)
Cons: 128 bits long; IDs don't go up with time; IDs could be non-numeric

Ticket Server
Pros: numeric IDs; easy to implement, works for small to medium applications
Cons: single point of failure (the system is down if the ticket server goes down)

Clock synchronization (Network Time Protocol)

Section length tuning (fewer sequence-number bits and more timestamp bits work well for low-concurrency, long-term applications)

High availability

Chapter 8 – Design a URL Shortener

HashMap (memory is limited)

Hash function (CRC32 / MD5 / SHA-1) + collision resolution (append a new predefined string, bloom filter)

Base62 Hash

Hash + collision resolution | Base 62 conversion
Fixed short URL length | Short URL length is not fixed; it goes up with the ID
Does not need a unique ID generator | Depends on a unique ID generator
Collision is possible and needs to be resolved | No collision, as the ID is unique
Not possible to figure out the next available short URL, as it doesn't depend on an ID | Easy to figure out the next available short URL if the ID increments by 1 for each new entry (this can be a security concern)
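
The base 62 approach can be sketched in a few lines. The alphabet ordering (digits, then lowercase, then uppercase) and the function names are assumptions made for this example; any fixed ordering of the 62 characters works as long as encoding and decoding agree.

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


def encode_base62(num):
    """Convert a non-negative integer ID to its base 62 string."""
    if num < 0:
        raise ValueError("ID must be non-negative")
    if num == 0:
        return ALPHABET[0]
    digits = []
    while num:
        num, remainder = divmod(num, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))  # most significant digit first


def decode_base62(text):
    """Convert a base 62 string back to the integer ID."""
    num = 0
    for ch in text:
        num = num * 62 + ALPHABET.index(ch)
    return num


print(encode_base62(11157))  # 2TX
```

The short URL grows slowly with the ID: any ID below 62**7 (about 3.5 trillion) fits in at most 7 characters.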

Chapter 9 – Design a web crawler 

URL Frontier: store URLs to be downloaded (politeness – frequency / priority / freshness – update)

HTML Downloader: download web pages from the internet using the HTTP protocol (Robots.txt)

DNS Resolver: translate URL to IP address

Content Parser: parse and validate web pages (malformed pages cause problems and waste storage)

Content Seen?: eliminate data redundancy and shorten processing time (compare hashes of pages rather than comparing them character by character)

Content Storage: store HTML content (most in disk, popular ones in memory) – data type, size, access frequency, life span  

URL Extractor: parse and extract links from HTML pages (convert relative path to absolute one) 

URL Filter: exclude certain content types, file extensions, error links, and URLs in blacklisted sites

URL Seen?: keep track of URLs that are visited before or already in the Frontier, avoid adding the same URL (hash table / bloom filter)  

URL Storage: store already visited URLs  
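
The bloom filter mentioned for the "URL Seen?" component can be sketched as below. The sizes, the SHA-256 based double hashing and the class name are assumptions chosen for illustration. A bloom filter never reports a false negative, but it can report a false positive, so a crawler using it may occasionally skip a URL it has not actually visited.

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits=1 << 20, num_hashes=5):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)  # all bits start at 0

    def _positions(self, item):
        # Derive several bit positions from one digest (double hashing).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.size for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # "Possibly seen" only if every position is set.
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


seen = BloomFilter()
seen.add("https://example.com/a")
print("https://example.com/a" in seen)  # True
```

Compared with a hash table of full URLs, the filter uses a fixed amount of memory (128 KiB with the sizes assumed here) regardless of URL length.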

Web crawler workflow 

DFS vs BFS

Performance optimization (distributed crawl, cache DNS resolver, geo-locality, short timeout)

Robustness (consistent hashing to add / remove servers, save crawl states and data, exception handling, data validation)

Extensibility (modules to download HTML web pages, image, audio, video, PDF, etc / detect copyright and trademark infringements)

Detect and avoid problematic content (redundant content, spider traps – infinite loop, data noise – ads, code snippets, spam URLs, etc)

Chapter 10 – Design a notification system

Notifications (mobile push notification, SMS, email)

Apple iOS push notification – APNS (Apple Push Notification Service)

Google Android push notification – FCM (Firebase Cloud Messaging)

SMS message (Twilio, Nexmo, etc)

Email (Sendgrid, Mailchimp, etc)

User (user_id, email, country_code, phone_number, created_at)

Device (device_id, device_token,  user_id, last_logged_in_at)

Service 1 to N => Notification system => Third-party services (APNS / FCM / SMS / Email providers)

Notification servers

POST https://api.example.com/v1/sms/send

Cache / DB / Message queues (remove component dependencies) / Workers / Third-party services

Reliability – Retry mechanism

Workers (Notification log – prevent data loss)

Notification template 

Notification setting (user_id, channel (push notification, email, sms), opt_in)

Rate limiting / Security in push notifications (appKey, appSecret – whitelist)

Monitor queued notifications / Events tracking (start, pending, sent, error, deliver, click, unsubscribe)

Chapter 11 – Design a news feed system

Feed publishing API – POST /v1/feed (content, user_id, auth_token)

Newsfeed retrieval API – GET /v1/feed (user_id, auth_token)

Web servers / Fanout service 

Pros and cons

Fanout on write (news feed is pre-computed during write time)
Pros: the news feed is generated in real time and can be pushed immediately; fetching the news feed is fast (pre-computed during write time)
Cons: hotkey problem (if a user has many friends, fetching the friend list and generating feeds for all of them is slow); for inactive users or those who rarely log in, pre-computing news feeds wastes computing resources

Fanout on read (news feed is generated during read time, on-demand model)
Pros: works better for inactive users or those who rarely log in (no wasted computing resources); no hotkey problem (data is not pushed to friends)
Cons: fetching the news feed is slow (not pre-computed)

Hybrid approach (push for most users, pull for celebrities or users who have many friends / followers)

Fanout service

1 Fetch friend IDs from the graph database (Neo4j)

2 Get friends info from the user cache / DB

3 Send friends list and new post ID to the message queue

4 Fanout workers fetch data from the message queue and store news feed data in the news feed cache

5 Store <post_id, user_id> in news feed cache

Newsfeed retrieval

1 A user sends a request to retrieve its news feed (/v1/feed)

2 The load balancer redistributes requests to web servers

3 Web servers call the news feed service to fetch news feeds

4 News feed service gets a list of post IDs from the news feed cache

5 The news feed service fetches the complete user and post objects from caches (userName, profile picture, post content, images, etc)

6 The fully hydrated news feed is returned in JSON format back to the client to render

Cache architecture

News Feed (news feed)

Content (hot cache, normal)

Social Graph (follower, following)

Action (liked, replied, others)

Counters (like, reply, others)

Scaling DB (Vertical scaling vs Horizontal scaling, SQL vs NoSQL, Master-slave replication, Read replicas, Consistency models, DB sharding)

Chapter 12 – Design a chat system 

Requirements

A one-on-one chat with low delivery latency

Small group chat (max of 100 people)

Online presence indicator (green dot beside profile picture)

Multi device support (laptop, phone, tablet)

Push notification

High-level design 

Send => chat service (store and relay message) => receiver

Polling: client periodically asks the server if there are messages available (costly, inefficient)

Long polling: client holds the connection open until there are actually new messages available or timeout threshold reached. Once a client receives new messages, it immediately sends another request, restarting the process. (Sender and receiver may not connect to the same chat server in stateless architecture, a server has no good way to tell if a client is disconnected, inefficient if a user doesn’t chat much)

Web Socket: most common solution for sending asynchronous updates from server to client. (Connection is initiated by the client, bi-directional and persistent. It starts its life as a HTTP connection and could be “upgraded” via some well-defined handshake to a WebSocket connection)

Stateless: service discovery (Apache Zookeeper) / Authentication / Group management / User profile

Stateful: chat service (Web Socket)

Third party: push notification

Scalability

Chat servers: message sending / receiving

Presence servers: manage online / offline status

API servers: user login, sign up, change profile, etc

Notification servers: send push notifications

Key-value store: store chat history (easy horizontal scaling, low latency, handle long tail data, HBase / Cassandra)

Data models

1on1 chat (message – message_id, message_from, message_to, content, created_at)

Group chat (group message – channel_id, message_id, message_from, content, created_at)

Message synchronization across multi devices (cur_max_message_id in diff devices, compare with latest one and only send fresh new messages)

Small group chat (message queue)

Online presence (sign in, log out, heart beat => disconnection)

Wrap up

Multi media file (text, image, audio, video, etc) – compression, cloud storage, thumbnails

End-to-end encryption (Whatsapp)

Caching messages on the client side

Improve load time (geographically distributed network to cache user data, channels, etc)

Error handling (the chat server error, message resent mechanism)

Chapter 13 – Design a search autocomplete system

Requirements: fast response time, relevant, sorted, scalable, highly available

Data gathering service: gather user input queries and aggregates them in real-time or offline

Query service: given a search query or prefix, return 5 most frequently searched terms

Intuitive idea

HashMap: key as query, value as frequency

DB SQL: 

SELECT * FROM frequency_table

WHERE query LIKE 'prefix%'

ORDER BY frequency DESC

LIMIT 5

Optimized solution

Trie data structure / Data gathering service / Query service / Scale the storage / Trie operations

Limit the max length of a prefix

Cache top search queries at each node
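
The idea of caching the top queries at each trie node can be sketched as follows. This is a minimal in-memory illustration that assumes the trie is built once from aggregated counts (each query inserted once); the class and method names are hypothetical.

```python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.top = []  # cached (frequency, query) pairs for this prefix, best first


class AutocompleteTrie:
    def __init__(self, k=5):
        self.k = k  # number of suggestions cached per node
        self.root = TrieNode()

    def insert(self, query, frequency):
        node = self.root
        for ch in query:
            node = node.children.setdefault(ch, TrieNode())
            # Refresh the cached top-k list of every prefix of the query.
            entries = [e for e in node.top if e[1] != query]
            entries.append((frequency, query))
            entries.sort(key=lambda e: (-e[0], e[1]))
            node.top = entries[: self.k]

    def suggest(self, prefix):
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        return [query for _, query in node.top]  # no subtree traversal needed


trie = AutocompleteTrie(k=2)
for query, freq in [("tree", 10), ("try", 29), ("true", 35), ("toy", 14), ("wish", 25), ("win", 50)]:
    trie.insert(query, freq)
print(trie.suggest("tr"))  # ['true', 'try']
```

A lookup costs only the length of the prefix, which is why limiting the maximum prefix length keeps query time effectively constant; the price is extra memory at every node.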

Data gathering service: Analytics logs – Aggregators – Aggregated data – Workers – Trie DB / Cache

Analytics logs: store raw data about search queries. Logs are append-only and are not indexed.

Aggregators: different use cases need different aggregation frequencies (real-time for Twitter, weekly for Google)

Aggregated data: query content, time, frequency

Workers: servers perform async jobs at regular intervals (build trie and store in Trie DB)

Trie DB: Document store like MongoDB, KV store like DynamoDB

Trie Cache: Trie node as key, frequency as value

Query service: AJAX requests (no page refresh) / Browser caching / Data sampling (1/N logged)

Trie operations: create, update, delete

Scale the storage: sharding – Naive (aa-ag, ah-an, ao-au, av-az), Optimized (s1 – s, s2 – u/v/w/x/y/z)

Wrap up

Multi language: store Unicode characters in Trie nodes (Unicode is an encoding standard that covers all the characters of all the writing systems of the world, modern and ancient)

Queries in one country are different from others: build tries for different countries, adopt CDNs

Support trending (real-time) search queries: reduce the working data set by sharding, change the ranking model to assign more weight to recent search queries, and treat the data as streams, since we don't have access to all the data at once (Hadoop MapReduce, Apache Spark Streaming / Storm / Kafka)

Chapter 14 – Design YouTube

Requirements: upload video fast, smooth video streaming, change video quality, low infrastructure cost, high availability, scalability, reliability, clients supported (mobile apps, web browser, smart TV)

High level design

Client: computer, mobile phone, smart TV

CDN: store videos for fast streaming (original file is stored in blob file storage like Amazon S3)

API Servers: feed recommendation, generate video upload URL, update metadata DB and cache, signup

Video uploading flow

User: watch a video on devices such as computer, mobile phone, or smart TV

Load balancer: evenly distributes requests among API servers

API servers: all user requests go through API servers except video streaming 

Metadata DB: store video metadata, sharded and replicated to meet performance and high availability

Metadata Cache: cache video metadata and user objects

Original storage: store original videos via blob storage (Binary Large Object) – a collection of binary data stored as a single entity in a database management system

Transcoding servers: video encoding, convert a video format to other formats (MPEG, HLS, etc) to provide best video streams for different devices and bandwidth capabilities

Transcoded storage: blob storage storing transcoded video files

CDN: cache videos for fast streaming

Completion queue: message queue storing info about video transcoding completion events

Completion handler: a list of workers that pull event data from the completion queue and update metadata cache and DB

Flow a: upload the actual video

1 videos are uploaded to the original storage

2 Transcoding servers fetch videos from the original storage and start transcoding

3 Once transcoding is complete, the following two steps are executed in parallel:

3a – Transcoded videos are sent to transcoded storage

3b – Transcoding completion events are queued in the completion queue

3a1 – Transcoded videos are distributed to CDN

3b1 – Completion handler contains a bunch of workers that continuously pull event data from the queue

3b1a and 3b1b – Completion handler updates the metadata DB and cache when video transcoding is done

4 API servers inform the client that the video is successfully uploaded and is ready for streaming

Flow b: update the metadata

While a file is being uploaded to the original storage, the client in parallel sends a request to update the video metadata. The request contains video metadata including file name, size, format, etc. API servers update the metadata cache and DB.

Video streaming flow

Popular streaming protocols:

MPEG-DASH: Moving Picture Experts Group, Dynamic Adaptive Streaming over HTTP

Apple HLS: HTTP Live Streaming

Microsoft Smooth Streaming

Adobe HTTP Dynamic Streaming (HDS)


Design deep dive

Video transcoding

Raw video consumes large amount of storage space

Many devices and browsers only support certain types of video formats

Ensure users watch high-quality video while maintaining smooth playback (based on bandwidth)

Network condition can change (especially on mobile devices) – switching video quality automatically or manually based on network condition 

Encoding formats

Container: contain video file, audio and metadata (.avi, .mov or .mp4)

Codecs: compression and decompression algorithms to reduce size (H.264, VP9, HEVC)

Directed acyclic graph (DAG) model

Tasks: Inspection / Video encoding (360p/480p/720p/1080p/4k.mp4) / Thumbnail / Watermark

Video transcoding architecture

Preprocessor: video splitting (chunks), GOP alignments for old clients, DAG generation, cache data

DAG scheduler: split a DAG graph into stages of tasks and put them in the task queue

Resource manager: manage efficiency of resource allocation (task queue, worker queue, running queue, task scheduler)

Task workers: run the tasks defined in DAG (watermark, encoder, thumbnail, merger)

Temporary storage: metadata in memory, video / audio in blob storage

Encoded video: final output of the encoding pipeline (funny_720p.mp4)
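
The DAG scheduler's job of splitting a graph into stages can be sketched with the standard library's `graphlib` module (Python 3.9+). The task names and their dependencies below are hypothetical, chosen only to resemble the tasks listed above.

```python
from graphlib import TopologicalSorter


def dag_stages(dependencies):
    """Group tasks into stages; all tasks in a stage can run in parallel.

    `dependencies` maps each task to the set of tasks it depends on.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises CycleError if the graph is not acyclic
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks whose dependencies are all done
        stages.append(ready)
        sorter.done(*ready)
    return stages


# Hypothetical dependencies for one uploaded video.
tasks = {
    "video_encoding": {"inspection"},
    "thumbnail": {"inspection"},
    "watermark": {"video_encoding"},
    "merge": {"watermark", "thumbnail"},
}
print(dag_stages(tasks))
# [['inspection'], ['thumbnail', 'video_encoding'], ['watermark'], ['merge']]
```

Each stage would then be placed in the task queue, and the resource manager can hand the tasks within a stage to separate workers.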

System optimizations

Speed optimization: parallelize video uploading (split video into smaller chunks by GOP alignment)

Speed optimization: place upload centers close to users

Speed optimization: parallelism everywhere

Safety optimization: pre-signed upload URL (access permission to the object identified in URL)

Safety optimization: protect your videos (Digital rights management, AES encryption, watermark)

Cost-saving optimization: only serve most popular videos from CDN (others from high capacity storage video servers), no need to store many encoded versions for less popular content, short videos can be encoded on-demand. Some videos are only popular in certain regions, no need to distribute to other areas. Build your own CDN like Netflix and partner with Internet Service Provider (ISP)

Error handling (recoverable vs non-recoverable – malformed video format)

Upload error: retry a few times

Split video error: the entire video is passed to the server

Transcoding error: retry

Preprocessor error: regenerate DAG diagram

DAG scheduler error: reschedule a task

Resource manager queue down: use a replica

Task workers down: retry the task on a new worker

API server down: direct requests to a different server

Metadata cache down: access other nodes to fetch data (bring up a new server to replace the dead one)

Metadata DB down: promote one slave to act as the new master if master is down, use another slave and bring up another to replace if slave down

Wrap up

Scale the API tier: keep stateless for API servers, easy to scale horizontally 

Scale the DB: DB replication and sharding

Live streaming: diff streaming protocol (low latency),  lower requirement for parallelism, diff error handling

Video takedowns: videos that violate copyrights, contain pornography, or involve other illegal acts (report => remove)

Chapter 15 – Design Google Drive 

Requirements: add files, download files, sync files across multi devices, see file versions, share files, send notification (when a file is edited, deleted, shared)

Reliability / Fast sync speed / Bandwidth usage / Scalability / High availability 

High-level design

A web server to upload  and download files

A database to keep track of metadata (user data, login info, files info, etc)

A storage system to store files

1 upload a file to Google Drive (simple / resume upload)

https://api.example.com/files/upload?type=resumable

2 download a file from Google Drive 

https://api.example.com/files/download ({"path": "/recipes/soup/best_soup.txt"})

3 get file versions

https://api.example.com/files/list_versions ({"path": "/recipes/soup/best_soup.txt", "limit": 20})

Sync conflicts (auto vs manual resolve)

User: A user uses the application either through a browser or mobile app

Block servers: Block servers upload blocks to cloud storage. Block storage, referred to as block-level storage, is a technology to store data files on cloud-based environments. A file can be split into several blocks, each with a unique hash value, stored in our metadata database. Each block is treated as an independent object and stored in our storage system (S3). To reconstruct a file, blocks are joined in a particular order. As for the block size, we use Dropbox as a reference: it sets the maximal size of a block to 4MB.

Cloud storage: A file is split into smaller blocks and stored in cloud storage

Cold storage: a computer system designed for storing inactive data, meaning files are not accessed for a long time

Load balancer: a load balancer evenly distributes requests among API servers.

API servers: responsible for almost everything other than the uploading flow. (authentication, managing user profile, updating file metadata, etc)

Metadata database: store metadata of users, files, blocks, versions, etc. 

Metadata cache: some of the metadata are cached for fast retrieval 

Notification service: a publisher / subscriber system that allows data to be transferred from notification service to clients as certain events happen

Offline backup queue: if a client is offline and can’t pull the latest file changes, the offline backup queue stores the info so changes will be synced when the client is online

Design deep dive

Block servers (delta sync / compression / encrypt / upload)
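
Delta sync on fixed-size blocks can be sketched as below: a file is split into blocks, each block is identified by its hash, and only blocks whose hash differs from the previously stored version are uploaded. The function names and the use of SHA-256 are assumptions for this example; compression and encryption are left out.

```python
import hashlib

BLOCK_SIZE = 4 * 1024 * 1024  # 4 MB, the maximal block size used as a reference


def split_into_blocks(data, block_size=BLOCK_SIZE):
    """Split file content (bytes) into consecutive fixed-size blocks."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def block_hashes(data, block_size=BLOCK_SIZE):
    """Hash of every block, in order; this list is what the metadata DB would keep."""
    return [hashlib.sha256(block).hexdigest() for block in split_into_blocks(data, block_size)]


def changed_blocks(old_hashes, new_data, block_size=BLOCK_SIZE):
    """Return (index, block) pairs that differ from the stored version."""
    changed = []
    for index, block in enumerate(split_into_blocks(new_data, block_size)):
        digest = hashlib.sha256(block).hexdigest()
        if index >= len(old_hashes) or old_hashes[index] != digest:
            changed.append((index, block))
    return changed


# Tiny block size so the effect is visible.
old = block_hashes(b"aaaabbbbcccc", block_size=4)
print(changed_blocks(old, b"aaaaXbbbcccc", block_size=4))  # [(1, b'Xbbb')]
```

One limitation of this simple scheme: inserting bytes near the start of a file shifts every later block, so all of them look changed even though most of the content is the same.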

High consistency requirement 

Data in cache replicas and the master is consistent

Invalidate caches on database write to ensure cache and DB hold the same value

Achieving strong consistency in a relational database is easy because it maintains the ACID properties (Atomicity / Consistency / Isolation / Durability)

Metadata database

User: user_id, name, email , profile photo, etc

Device: device_id, user_id, last_logged_in_at

Workspace: root directory of a user (id, owner_id, is_shared, created_at)

File: file_id, name, relative_path, is_directory, latest_version, checksum, workspace_id, created_at, last modified

File version: id, file_id, device_id, version_number, last modified

Add file metadata

1 Client 1 sends a request to add the metadata of the new file

2 Store the new file metadata in a metadata DB and change the file upload status to “pending”

3 Notify the notification service that a new file is being added

4 The Notification service notifies relevant clients (client 2) that a file is being uploaded

Upload files to cloud storage

2.1 Client 1 uploads the content of file to block servers

2.2 Block servers chunk the files into blocks, compress, encrypt the blocks, and upload them to cloud storage

2.3 Once the file is uploaded, cloud storage triggers upload completion callback. The request is sent to API servers

2.4 File status changed to “uploaded” in Metadata DB

2.5 Notify the notification service that a file status is changed to “uploaded”

2.6 The notification service notifies relevant clients (client 2) that a file is fully uploaded

Download flow

1 Notification service informs client 2 that a file is changed somewhere else

2 Once client 2 knows that new updates are available, it sends requests to fetch metadata

3 API servers call metadata DB to fetch metadata of the changes

4 Metadata is returned to the API servers

5 Client 2 gets the metadata

6 Once the client receives the metadata, it sends requests to block servers to download blocks

7 Block servers first download blocks from cloud storage

8 Cloud storage returns blocks to the block servers

9 Client 2 downloads all the new blocks to reconstruct the file

Notification service

Dropbox uses long polling (WebSocket is suited for real-time bi-directional communication such as chat app)

Save storage space 

De-duplicate data blocks / Adopt an intelligent data backup strategy (set a limit for number of versions to store, keep valuable versions only) / Moving infrequently used data to cold storage

Failure handling

Load balancer / Block server / Cloud storage / API server / Metadata cache & DB / Notification service / Offline backup queue failure

Wrap up

Uploading files directly to cloud storage from the client, instead of going through block servers, has a few drawbacks:

1 The same chunking, compression and encryption logic must be implemented on different platforms (iOS, Android, Web). It's error-prone and requires a lot of engineering effort

2 A client can easily be hacked or manipulated, so implementing the encryption logic on the client side is not ideal

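Chinese Podcast – Tech News Commentary (科技要闻评论)

Jan 31: GPT unlocks a new way to play! @ 3 million AIs working for you! $20 a month! https://www.ximalaya.com/sound/703176608

Nov 27: Bill Gates says AI will let workers work 3 days a week! Too much of an exaggeration? https://www.ximalaya.com/sound/688346456

Sep 21: US plant approved, German factory goes into production! This Chinese battery maker speeds up its push overseas https://www.ximalaya.com/sound/667470168

Sep 13: Musk's biography is here: part tycoon, part demon? https://www.ximalaya.com/sound/665557511

Sep 12: MBA students compete with GPT-4 on creativity! The result is unexpected! https://www.ximalaya.com/sound/665221581

Aug 25: The next hot spot after GPT! Why are Silicon Valley leaders chasing AI Agents? https://www.ximalaya.com/sound/659748074

Aug 23: 7,000 McKinsey employees start working with their in-house GPT! What trend lies behind this? https://www.ximalaya.com/sound/659273343

Aug 11: It's far more than AI! Another wave of unexpected big opportunities

https://xima.tv/1_ty2O3V?_sonic=0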

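Chinese Podcast – Guokr Radio (果壳电台)

Understand it in one go: why can the sophons lock down human technology? https://www.ximalaya.com/sound/617788170

(reductionism – physics vs. emergence – biology)

[Guokr Good Health] When running, should the heel or the forefoot land first? https://www.ximalaya.com/sound/661523835

[A Scientific Guide to Eating and Drinking] Why do tomatoes taste worse and worse? https://www.ximalaya.com/sound/669173529

[Fantastic Animals] Ten truths about pandas: six fingers? A black tail? https://www.ximalaya.com/sound/681581108

Why does wearing the most leave you the coldest? How should you really choose clothes for winter? https://www.ximalaya.com/sound/681571855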

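Chinese Podcast – Science Has Stories (科学有故事)

Exploring curiosity: the story of Discovery and its founder https://www.ximalaya.com/sound/674252362

Paid show, free to listen: "A Foodie's Guide to Science" on preservatives, a good thing burdened by its name https://www.ximalaya.com/sound/676874948

GPT-4 Turbo is here, and I chatted with it about the most fundamental questions of the universe https://www.ximalaya.com/sound/682301918

Wang Jie's Miscellany: deducing the whole universe from a single number may not be a myth https://www.ximalaya.com/sound/684988150

Wang Jie's Miscellany: I talk about scientific thinking with a contrarian cat https://www.ximalaya.com/sound/690488012

Wang Jie in conversation with the kitten: how to be more rational https://www.ximalaya.com/sound/704803943

Listener Q&A: can the official dietary guidelines of different countries be trusted? https://www.ximalaya.com/sound/671581856

Wang Jie's Miscellany: are pre-made meals in schools good or bad? https://www.ximalaya.com/sound/669878354

Science Has Stories: why do some people die suddenly from cardiac arrest? (genes)

Listener Q&A 2304: is gaining weight harder than losing it? Only the thin know https://www.ximalaya.com/sound/663913631

(Medically, one option is to directly transplant the gut flora of an overweight person (via enema); the simpler way is to live with an overweight person. My wife used to be the type who never gained weight no matter what she ate, and now there is no stopping it)

Camping "fun" electric season: must-know basics for new-energy-vehicle beginners https://www.ximalaya.com/sound/661257228

Nature's Secrets 01 – this famous experiment tried to recreate the great moment when life began https://www.ximalaya.com/sound/335254508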

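Chinese Podcast – Thinking Box (思考盒子)

Thinking Box

409 The shifting center of world science [Italy] – https://www.ximalaya.com/sound/34383496

419 The shifting center of world science [Britain] https://www.ximalaya.com/sound/34446150

421 The shifting center of world science [France] https://www.ximalaya.com/sound/34613906

431 The shifting center of world science [Germany] https://www.ximalaya.com/sound/34798850

433 The shifting center of world science [United States] https://www.ximalaya.com/sound/34844295

439 The shifting center of world science [next stop] https://www.ximalaya.com/sound/34874147

463 Your [IQ] balance is running low, please top up soon https://www.ximalaya.com/sound/35831661

479 Your [memory] balance is running low, please top up soon https://www.ximalaya.com/sound/36272525

467 Your [EQ] balance is running low, please top up soon https://www.ximalaya.com/sound/36046177

593 The right way to open your mind [logical fallacies] https://www.ximalaya.com/sound/39326627

599 The right way to open your mind [ultimate questions without answers], part 1 https://www.ximalaya.com/sound/39670141

601 The right way to open your mind [ultimate questions without answers], part 2 https://www.ximalaya.com/sound/39869016

607 Irresistible psychological effects [Minecraft] https://www.ximalaya.com/sound/40096928

613 Irresistible psychological effects [The Sims] https://www.ximalaya.com/sound/40312311

617 Irresistible psychological effects [Left 4 Dead] https://www.ximalaya.com/sound/40362005

619 Irresistible psychological effects [Final Fantasy] https://www.ximalaya.com/sound/40455089

631 Irresistible psychological effects [Call of Duty] https://www.ximalaya.com/sound/40887633

641 Irresistible psychological effects [Red Alert] https://www.ximalaya.com/sound/41153126

643 Names of the gods [planets of the solar system] https://www.ximalaya.com/sound/41311252

647 Names of the gods [medicines] https://www.ximalaya.com/sound/41481560

653 Names of the gods [brand trademarks] https://www.ximalaya.com/sound/41707139

659 Names of the gods [a bonus episode for everyone] https://www.ximalaya.com/sound/42070410

661 Know your own body [stories of anatomy] https://www.ximalaya.com/sound/42262748

673 Know your own body [systems of the human body] https://www.ximalaya.com/sound/42618679

677 Know your own body [a complete guide to seeing a doctor] https://www.ximalaya.com/sound/42877725

691 Quarrels in the math world [infinity and set theory] https://www.ximalaya.com/sound/43278687

701 Quarrels in the math world [the meaning of mathematics] https://www.ximalaya.com/sound/43545162

739 Where does the road lead? [maps] https://www.ximalaya.com/sound/45256791

751 Where does the road lead? [aeronautical charts] https://www.ximalaya.com/sound/46021783

757 Where does the road lead? [spaceflight charts] https://www.ximalaya.com/sound/46435566

761 The seven weapons: [Occam's razor] https://www.ximalaya.com/sound/47090352

769 The seven weapons: [Hume's fork] https://www.ximalaya.com/sound/47335803

773 The seven weapons: [Kant's glasses] https://www.ximalaya.com/sound/47590783

787 The seven weapons: [Popper's doll] https://www.ximalaya.com/sound/47976841

797 The seven weapons: [the Ring of Gyges] https://www.ximalaya.com/sound/48254012

809 The seven weapons: [Gongsun Long's brick] https://www.ximalaya.com/sound/48709530

811 The seven weapons: [Nietzsche's hammer] https://www.ximalaya.com/sound/48997511

821 I'm not well read, so don't fool me [the Voynich manuscript] https://www.ximalaya.com/sound/49394229

823 I'm not well read, so don't fool me [The Centuries] https://www.ximalaya.com/sound/49622994

829 I really want to live another 500 years [Ramanujan] https://www.ximalaya.com/sound/50318148

839 I really want to live another 500 years [Abel] https://www.ximalaya.com/sound/50682687

853 I really want to live another 500 years [Galois] https://www.ximalaya.com/sound/51013213

857 I really want to live another 500 years [Pascal] https://www.ximalaya.com/sound/51271374

859 I really want to live another 500 years [Riemann] https://www.ximalaya.com/sound/51942069

883 Talk it out properly [the origin of language] https://www.ximalaya.com/sound/53382503

887 Talk it out properly [chatting about Chinese] (full version) https://www.ximalaya.com/sound/53583183

907 Talk it out properly [Northeastern dialect] https://www.ximalaya.com/sound/54267866

911 The universe and us [the Fermi paradox (part 1)] https://www.ximalaya.com/sound/54871182

919 The universe and us [the Fermi paradox (part 2)] https://www.ximalaya.com/sound/55213325

929 The universe and us [talking with aliens] https://www.ximalaya.com/sound/55899545

937 The universe and us [what aliens look like] https://www.ximalaya.com/sound/56456434

941 The universe and us [the SETI project] https://www.ximalaya.com/sound/57035636

977 Primary school learning [Math Olympiad] https://www.ximalaya.com/sound/59740366

983 Primary school learning [composition] https://www.ximalaya.com/sound/60042029

991 Primary school learning [English] https://www.ximalaya.com/sound/61679340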

1009中古世界奇迹【古罗马斗兽场】 https://www.ximalaya.com/sound/61810723

1013中古世界奇迹【比萨斜塔】 https://www.ximalaya.com/sound/62087338

1019中古世界奇迹【亚历山大陵墓】 https://www.ximalaya.com/sound/62834349

1021中古世界奇迹【英国巨石阵】 https://www.ximalaya.com/sound/63592530

1033药不能停【高血压】 https://www.ximalaya.com/sound/64725722

1039药不能停【糖尿病】 https://www.ximalaya.com/sound/64950232

1051兵器进化史【冷兵器】 https://www.ximalaya.com/sound/65932369

1061兵器进化史【Ye兵器-咱们篇】 https://www.ximalaya.com/sound/66860543

1063兵器进化史【Ye兵器-他们篇】 https://www.ximalaya.com/sound/67566634

1069兵器进化史【木仓】 https://www.ximalaya.com/sound/67951183

1087兵器进化史【火包】 https://www.ximalaya.com/sound/68615570

1091兵器进化史【弓单】 https://www.ximalaya.com/sound/69439537

1117信息不对称【壹】 https://www.ximalaya.com/sound/74306965

1123信息不对称【贰】 https://www.ximalaya.com/sound/75080458

1129信息孤岛【叁】 https://www.ximalaya.com/sound/75952729

1151【自由意志】我只想安静地吃一碗麻辣烫 https://www.ximalaya.com/sound/76813136

1153【霍金之死】有些故事还没讲完那就算了吧 https://www.ximalaya.com/sound/77259123

1163【吓尿指数】技术奇点何时到来? https://www.ximalaya.com/sound/78549299

1171【科技锁死】现在的努力都是为了曾经吹过的牛 https://www.ximalaya.com/sound/79639889

1181【科学的尽头】我们在谈论科学的时候我们在谈论什么 https://www.ximalaya.com/sound/80587924

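25:00 (You ask a scientist to show you a miracle? Fine: I will produce a Tesla lightning bolt for you, or clone a turtle for you. You could say that every scientific experiment is a scientist demonstrating a miracle. God's miracles, by contrast, have never appeared; they exist only in my dreams, in my heart, and in my songs.)

Science: begins in doubt, ends in awe

Religion: begins in faith, ends in habit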

【音乐】外语里叮咣噼里啪啦的歌 https://www.ximalaya.com/sound/80963532

【音乐】在那些怎么地都睡不着的夜晚 https://www.ximalaya.com/sound/82076965

1187【鄙视链】一个贪生怕死的无奈链条 https://www.ximalaya.com/sound/81653083

1193【论中微子振荡与轻子的稀有衰变】 https://www.ximalaya.com/sound/83477721

1201【应用物理隔离阻断DNA传递的研究】 https://www.ximalaya.com/sound/84601366

电焊技术哪家强?【番外篇】 https://www.ximalaya.com/sound/82862291

1213【疼痛的终极对决】 https://www.ximalaya.com/sound/85959314

1217【生物钟】关于PER 蛋白使period 基因失去活性的研究 https://www.ximalaya.com/sound/86756271

1223【如何设计一个完美的试验】 https://www.ximalaya.com/sound/87484527

1229【定义死亡】生存还是死亡,这是一个问题 https://www.ximalaya.com/sound/88059815

1231【拉瓦锡】化学狂魔与小萝莉 https://www.ximalaya.com/sound/88939547

1237【波义耳】重新定义了化学 https://www.ximalaya.com/sound/89501973

1249【道尔顿】一位色盲症患者对于原子论的探索 https://www.ximalaya.com/sound/90461863

1259【曼德拉效应】米老鼠的裤子没有没背带 https://www.ximalaya.com/sound/91043613

1279【反馈】你的沉默,让我不安 https://www.ximalaya.com/sound/92622535

1283【狼来了】我还能相信谁? https://www.ximalaya.com/sound/93071546

1289【仪式感】生命的最后一根稻草 https://www.ximalaya.com/sound/94058031

1319【梦在远方】探索宇宙真的有用吗 https://www.ximalaya.com/sound/99322448

1321【不懂就别瞎说】 存在即合理 https://www.ximalaya.com/sound/100567869

1361【不懂就别瞎说】他人即地狱 https://www.ximalaya.com/sound/103166713

1367【不懂就别瞎说】我思故我在 https://www.ximalaya.com/sound/103938667

1409【冬眠】但愿长睡不愿醒 https://www.ximalaya.com/sound/107939837

1423【人体冷冻】逃离死神还有多远? https://www.ximalaya.com/sound/119979037

1433 【梦游】并不是做梦的时候在游走 https://www.ximalaya.com/sound/124247605

1439【不对称性】我总是喜欢用左手,怎么办?【A】 https://www.ximalaya.com/sound/125379447

1439【不对称性】我总是喜欢用左手,怎么办?【B】 https://www.ximalaya.com/sound/125394256

1439【不对称性】我总是喜欢用左手,怎么办?【C】 https://www.ximalaya.com/sound/125409404

1/2【黎曼猜想】听这一篇就够了 https://www.ximalaya.com/sound/125444399

1447【手性】左与右的差距咋就这么大呢 https://www.ximalaya.com/sound/126722065

1451【对称性破缺】上帝是左撇子吗? https://www.ximalaya.com/sound/127419439

1471【地震】你不知道的那些事 https://www.ximalaya.com/sound/130439471

1481【分形】英国的海岸线有多长 https://www.ximalaya.com/sound/131210703

1483【拓扑】只要不捅出洞就没事 https://www.ximalaya.com/sound/132516681

1487【混沌】复杂从何而来 https://www.ximalaya.com/sound/134106386

1489【湍流】经典物理学最后的未解难题 https://www.ximalaya.com/sound/135812044

1511关于趋化性细胞因子受体-5敲除后对获得性免疫缺乏综合征预防的研究进展 https://www.ximalaya.com/sound/142421130

1531【消失的国度】亚特兰蒂斯 https://www.ximalaya.com/sound/143192117

1543【消失的国度】玛雅文明(上) https://www.ximalaya.com/sound/144873429

1549【消失的国度】玛雅文明(下) https://www.ximalaya.com/sound/145658536

1553【消失的国度】庞贝古城 https://www.ximalaya.com/sound/146764701

1559【消失的国度】腓尼基文明 https://www.ximalaya.com/sound/148289457

1567【信息进化史】存储(上) https://www.ximalaya.com/sound/149016654

1571 【信息进化史】存储(中) https://www.ximalaya.com/sound/150330730

1579【信息进化史】存储(下) https://www.ximalaya.com/sound/152253354

1583【数学幽灵】一π胡言 https://www.ximalaya.com/sound/154087427

1597【数学幽灵】e可赛艇 https://www.ximalaya.com/sound/155760579

1601【数学幽灵】φ然成章 https://www.ximalaya.com/sound/157611513

1613【圣者遗物】 爱因斯坦的大脑 https://www.ximalaya.com/sound/160259116

1619【圣者遗物】伽利略的手指 https://www.ximalaya.com/sound/161953082

1621【圣者遗物】爱迪生的最后一口气 https://www.ximalaya.com/sound/163026330

1627【自行脑补】看啥都像脸 https://www.ximalaya.com/sound/164639374

1637【自行脑补】阅片无数心中无码 https://www.ximalaya.com/sound/166379453

1657【自行脑补】汉字序顺并不响阅影读 https://www.ximalaya.com/sound/168422087

1663【两获诺奖】低调的巴丁 https://www.ximalaya.com/sound/169888460

1667【两获诺奖】执着的桑格 https://www.ximalaya.com/sound/170589878

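(Not reckless in youth, not showy in fame, not wasteful in wealth, not wavering in loneliness, not complacent in success, not lost at the peak.)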

1669【两获诺奖 】矛盾的鲍林 https://www.ximalaya.com/sound/172240154

1709 【地球有多重】 https://www.ximalaya.com/sound/173799131

1721【宇宙有多重】 https://www.ximalaya.com/sound/175313116

1723【灵魂有多重】 https://www.ximalaya.com/sound/176890808

1733【超级工程】国际空间站 https://www.ximalaya.com/sound/177902393

1741【超级工程】阿波罗登月 https://www.ximalaya.com/sound/179448784

1747【超级工程】曼哈顿原子弹计划 https://www.ximalaya.com/sound/180529003

1753【超级工程】人类基因组计划 https://www.ximalaya.com/sound/181627793

1759【超级工程】大型强子对撞机 https://www.ximalaya.com/sound/182832462

1777【终极之问】我是谁 https://www.ximalaya.com/sound/184301440

Web (social relationships), mirror (evaluation by others and by oneself), flesh (the body), spirit (personality and memory)

1783【终极之问】我从哪里来 https://www.ximalaya.com/sound/185986598

Family tree / hometown / food

1787【终极之问】我到哪里去 https://www.ximalaya.com/sound/187089668

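Gauguin's painting "Where Do We Come From? What Are We? Where Are We Going?"

"A person dies three times in a lifetime. The first is when your heart stops and your breathing fades, and you are declared biologically dead. The second is at your burial, when people in black attend your funeral and declare that you no longer exist in this society; you vanish from the web of human relationships and quietly depart. The third is when the last person in the world who remembers you forgets you. Then you are truly dead, and the whole universe has nothing more to do with you."

From the film Coco (寻梦环游记)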

1789【漫长的实验】太阳黑子 https://www.ximalaya.com/sound/188363602

1801【漫长的实验】天才成长计划 https://www.ximalaya.com/sound/189286299

1811【漫长的实验】明尼苏达饥饿实验 https://www.ximalaya.com/sound/190121147

1823【漫长的实验】永生的海拉细胞 https://www.ximalaya.com/sound/191599788

1831【师徒往事】戴维与法拉第 https://www.ximalaya.com/sound/192831011

1847【师徒往事】原子结构模型 https://www.ximalaya.com/sound/194240815

1861【师徒往事】希腊三贤 https://www.ximalaya.com/sound/195773895

Socrates: Know thyself

Plato: Knowledge is food for the mind

Aristotle: I love my teacher, but I love truth more

1867【杠精】的自我修养 https://www.ximalaya.com/sound/197475163

(Assorted logical fallacies)

1871【键盘侠】的自我修养 https://www.ximalaya.com/sound/198684018

1879【巨婴】的自我修养 https://www.ximalaya.com/sound/201440488

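(What does this have to do with you? What does this have to do with me?)

(Kindness needs some edge, and innocence needs some wisdom.)

(Behind every "giant baby" is a parent who spoils them: living off their parents materially, never weaned emotionally.)

(As parents, learn to let go, with both your hands and your heart.)

(You can accompany and you can guide: offer advice at key moments and a helping hand in a crisis.)

(No parent wants to raise a child into a mere lump of flesh. The aim is to instill a soul, a free and upright one. Perhaps not wealthy, but sunny, independent, and striving upward.)

(The educator Makarenko said: giving everything to the child and sacrificing everything for them, even your own happiness, is the most terrible gift parents can give a child.)

(Education expert Yin Jianli / Peking University writer Zhao Jie said: I admire a kind of parent who gives intense closeness while the child is small and learns to withdraw gracefully once the child has grown. Care and separation are both tasks parents must complete with their children. The parent-child relationship is not permanent possession but a deep bond within a lifetime. We must neither leave the child with a barren childhood nor make adulthood feel suffocating. Being a parent is a long journey of heart and wisdom. And not only in parenting: many moments in life call for knowing when to advance and when to step back.)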

1889【最后悔的发明】塑料 https://www.ximalaya.com/sound/202968899

1901【最后悔的发明】费利克斯·霍夫曼 https://www.ximalaya.com/sound/203892476

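(Having been through so much pain, I feel more and more that intense, short-lived happiness no longer suits me. A mild, lasting sense of well-being is what I want.)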

1907【最后悔的发明】小托马斯•米基利 https://www.ximalaya.com/sound/205348395

1913【西方黑历史】医药学 https://www.ximalaya.com/sound/207102388

1931【西方黑历史】炼金术(上) https://www.ximalaya.com/sound/208446807

1933【西方黑历史】炼金术(下) https://www.ximalaya.com/sound/209517538

1951【西方黑历史】占星术 https://www.ximalaya.com/sound/212054598

1949【番外篇】大学生存指南 https://www.ximalaya.com/sound/210532900

2027【统计学的力量】齐夫定律 https://www.ximalaya.com/sound/228676567

2029【统计学的力量】本福特定律 https://www.ximalaya.com/sound/230438727

2039【统计学的力量】大数定律 https://www.ximalaya.com/sound/233895221

2053【统计学的力量】热手效应 https://www.ximalaya.com/sound/235698380

2028【我还是相信爱情】 https://www.ximalaya.com/sound/231844803

2087【性感是怎样炼成的】比基尼的进化 https://www.ximalaya.com/sound/246494558

2089【性感是怎样炼成的】狂野豹纹 https://www.ximalaya.com/sound/248004412

2090再见理想 https://www.ximalaya.com/sound/250144822

2115【收费节目免费听】 这期节目一定要一口气听完哦,原因你懂的 https://www.ximalaya.com/sound/265659635

2129【人机大战】卡斯帕罗夫VS深蓝 https://www.ximalaya.com/sound/270710111

2131【人机大战】阿尔法狗VS李世石、柯洁 https://www.ximalaya.com/sound/276776162

2137【人机大战】其它那些被忽略的比赛 https://www.ximalaya.com/sound/280284050

2139关于康德哲学思想的一点点理解 https://www.ximalaya.com/sound/286042866

2141怎么活才能更幸福? https://www.ximalaya.com/sound/286892407

2153【干细胞野史】最成功的造假者-皮耶罗•安韦萨 https://www.ximalaya.com/sound/291807712

2161【干细胞野史】最美造假者-小保方晴子 https://www.ximalaya.com/sound/294445538

2179【干细胞野史】最励志的造假者-黄禹锡 https://www.ximalaya.com/sound/297384857

2203【科学圣地】卡文迪许实验室 https://www.ximalaya.com/sound/300161674

2207【科学圣地】贝尔实验室 https://www.ximalaya.com/sound/305098860

2213【科学圣地】欧洲核子研究中心 https://www.ximalaya.com/sound/307027692

2221【科学圣地】病毒的牢笼-P4实验室(赠送大家一期) https://www.ximalaya.com/sound/309112623

2239【韭菜收割机】小霸王,步步高,OPPO,VIVO,也许还有拼多多 https://www.ximalaya.com/sound/313385977

2243【韭菜收割机】巨人汉卡,脑白金,黄金搭档,征途 https://www.ximalaya.com/sound/315124601

2251【韭菜收割机】乐视、法拉第未来 https://www.ximalaya.com/sound/318365241

2253【韭菜收割机(续)】为什么只收你1块钱! https://www.ximalaya.com/sound/321387322

2267【听着就脑袋疼】裂脑人 https://www.ximalaya.com/sound/323772757

2269【听着就脑袋疼】前额叶切除术 https://www.ximalaya.com/sound/325838228

2273【听着就脑袋疼】换头术 https://www.ximalaya.com/sound/327996421

2281【听着就脑袋疼】脑机接口 https://www.ximalaya.com/sound/333400502

2287【谣言心理学】造谣 https://www.ximalaya.com/sound/335676101

2293【谣言心理学】传谣 https://www.ximalaya.com/sound/337785416

2297【谣言心理学】辟谣 https://www.ximalaya.com/sound/337785416

2299【重制】【绝对催眠】【上帝已死】 https://www.ximalaya.com/sound/344346287

2310【答听友问】我要 https://www.ximalaya.com/sound/348579971

2309【科幻法则】机器人三定律 https://www.ximalaya.com/sound/346429457

2311【科幻法则】黑暗森林 https://www.ximalaya.com/sound/350835843

2333【科幻法则】卡尔达肖夫指数 https://www.ximalaya.com/sound/352994796

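Two things fill my mind with ever new and increasing wonder and awe, the more deeply and persistently I reflect on them: the starry sky above me and the moral law within me.

(Kant)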

2340【答听友问】这期节目年轻人听不懂 https://www.ximalaya.com/sound/359626449

2349【答听友问】当科学艰难的攀登到山顶时,发现登山运动员已经在这里守候多时 https://www.ximalaya.com/sound/366187612

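My personal feeling is that our whole society, every one of us, has been deeply brainwashed. From the education we receive as children to the wider social environment, we are conditioned all along, and much of what we do does not truly follow our own inner wishes. In the vast majority of cases we act to fit society's rules, to meet our parents' expectations, or to follow the path our family has laid out, so the longer we live the more lost we feel. This is especially true of work, which becomes the center of many people's lives. Of course, there is no way around it: without work you starve. But let me remind you that work is only one part of life, and for the vast majority of us it is very unlikely that work will bring financial freedom. So plan your life early, sort out the relationship between work and life, and give work its proper place, because this is something you will face your whole life. Either change your working environment or change your own mindset. Otherwise you are bound to be unhappy.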

2339【被忽略的那些神人】居维叶 https://www.ximalaya.com/sound/355148322

2341【被忽略的那些神人】亚力山大·冯·洪堡 – https://www.ximalaya.com/sound/361291091

2347【被忽略的那些神人】威廉·冯·洪堡 https://www.ximalaya.com/sound/362895074

2342【番外篇】你的孩子阶级跨越的最大难点在哪?虽然扎心但我还是想说真话(转) https://www.ximalaya.com/sound/361767166

If I had to sum up the dream of "class mobility" in one sentence, it would be: "In life, make more of the hard decisions."

2351【简易装*指南】凡尔赛文学 https://www.ximalaya.com/sound/367763465

2357【简易装*指南】标题党 https://www.ximalaya.com/sound/370730802

2371【简易装*指南】碎片化学习 https://www.ximalaya.com/sound/372910950

Just don't think about studying while you are gaming, or about gaming while you are studying.

2352【番外篇】如何提一个好问题 https://www.ximalaya.com/sound/369248786

2368捐精和理想主义者 https://www.ximalaya.com/sound/372556040

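Only by looking up at the stars can you find the right direction forward;

only by keeping your feet on the ground can you run toward a better future.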

2372从科学的角度分析,时间的流逝越来越快了 https://www.ximalaya.com/sound/373473672

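To be a good host you need two basic skills, and you must have at least one of them. The first is the ability to talk off the cuff with a straight face: you react quickly and think nimbly, and when someone asks you a question, whether or not you know the answer, you never let the room go silent. The second is natural delivery: you cannot simply read from the script, and even with a script in front of you, listeners should not be able to tell that you prepared in advance.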

2374真的什么时候开始学习都不晚吗 https://www.ximalaya.com/sound/374109429

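Don't change careers at thirty, don't learn a new trade at forty, don't build a house at fifty, don't have new clothes made at seventy.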

2376烟博士:做科普为什么赚不到钱 https://www.ximalaya.com/sound/374495889

2378高血压糖尿病是不是药企为了赚钱故意治不好的 https://www.ximalaya.com/sound/375471959

The way you look at the world is the way the world will be.

2379如何评价特朗普 https://www.ximalaya.com/sound/376166348

2380真随机和伪随机 https://www.ximalaya.com/sound/377066403

2382美国可以随便印美元吗 https://www.ximalaya.com/sound/378751697

2382我给你推荐几本书吧 https://www.ximalaya.com/sound/379567742

2377【致命的礼物】马凡综合征 https://www.ximalaya.com/sound/375142500

2381【致命的礼物】阿斯伯格综合征 https://www.ximalaya.com/sound/377612588

2383【致命的礼物】幸存者综合征 https://www.ximalaya.com/sound/380373823

2384我们为什么要探索太空【应该是今年最垃圾的一期】 https://www.ximalaya.com/sound/381107146

2393原来是这些东西让我快乐? https://www.ximalaya.com/sound/382424947

2411量子计算机和量子力学到底有什么关系? https://www.ximalaya.com/sound/383579437

2417为啥美国人不吃大豆? https://www.ximalaya.com/sound/384310542

2423你的大脑只开发了10%吗? https://www.ximalaya.com/sound/384960698

3511 无知即自由,认知即痛苦 https://www.ximalaya.com/sound/456576619

3541 法国的数学咋就这么强呢? https://www.ximalaya.com/sound/460366837

3581德国的哲学咋就这么强 https://www.ximalaya.com/sound/462481197

3613 奥地利的音乐为什么这么强 https://www.ximalaya.com/sound/464497117

3607我好失败呀,我很自卑,怎么办? https://www.ximalaya.com/sound/464492973

3631人类一思考上帝就发笑 https://www.ximalaya.com/sound/466277160

3637你接受进化论吗 https://www.ximalaya.com/sound/466874872

3643到底什么是自由 https://www.ximalaya.com/sound/466000922

3673 我站在有效射程以外,你能打到我吗 https://www.ximalaya.com/sound/470126226

3677富不过三代,是真的吗 https://www.ximalaya.com/sound/470279365

3821 为啥懂得很多道理却依然过不好这一生 https://www.ximalaya.com/sound/475260694

3852 直觉会给你答案吗?(投稿人:二手专家) https://www.ximalaya.com/sound/483062070

3907 言论自由 https://www.ximalaya.com/sound/484825912

6521 人工智能的过去、现在和未来 https://www.ximalaya.com/sound/640088509

7103 我们为什么没有逻辑学这门课 https://www.ximalaya.com/sound/661983894

7829 【长篇】《道德经》 https://www.ximalaya.com/sound/701303257


Podcast

a16z

Freakonomics

Hidden Brain

Huberman Lab

Lex Fridman Podcast

People I (Mostly) Admire

ReThinking

Something You Should Know

The Knowledge Project

The Michael Shermer Show

The Next Big Idea

科技要闻评论

思考盒子

科学有故事

生动早咖啡

随口说美国

开言英语

忽左忽右

硅谷101


Commonly Used Commands

1. Compile Node project

rm -rf yarn.lock node_modules

yarn install --registry "registry_host_name"

Upgrade node module to specific version

yarn add node-module-client@^1.3.1

2. Eclipse

Open Workspace 

/Users/yli/Projects/userprfl

Build Project (RaptorPlatform – user – errorContent/svc/ib/Test)

/Users/yli/Projects/user/userl

https://www.tutorialspoint.com/eclipse/eclipse_tips_tricks.htm

DISCOVERING SHORTCUT KEYS => Ctrl/Cmd + Shift + L to open a widget that shows all the shortcut keys.

CONTENT ASSIST => Ctrl + Space to see a list of suggested completions. Typing one or more characters before pressing Ctrl + Space will shorten the list.

PARAMETER HINT => Ctrl + Shift + Space to see a list of parameter hints.

Search File => Cmd/Ctrl + Shift + R 

Run Server

Tomcat

3. Git 

git reset (unstage all staged changes; working tree files are left untouched)

git checkout -b yli origin/yli (create branch yli to track on origin/yli)

git fetch origin

git branch -a

git remote show origin

git remote

git log --oneline

git reset 7f81f5b6 (back to commit 7f81f5b6)

git pull origin yli12

fork

Pull Request

https://yangsu.github.io/pull-request-tutorial/

https://www.digitalocean.com/community/tutorials/how-to-rebase-and-update-a-pull-request

git checkout -b branch_name --track origin/branch-name

Ex: gco -b rn_st --track upstream/remedy_notification (assuming gco is a shell alias for git checkout)

Rebase the current branch onto master

https://stackoverflow.com/questions/7929369/how-to-rebase-local-branch-with-remote-master/18442755

Squash

git reset d43e15

git commit -am 'new commit name'

https://stackoverflow.com/questions/5189560/squash-my-last-x-commits-together-using-git

https://stackoverflow.com/questions/5667884/how-to-squash-commits-in-git-after-they-have-been-pushed

# Add 'upstream' repo to list of remotes

git remote add upstream https://github.com/UPSTREAM-USER/ORIGINAL-PROJECT.git

Skip pre-commit hooks (e.g. lint checks) when committing; git push accepts the same flag to skip the pre-push hook

git commit --no-verify

git tag -a v1.5 -m "tag_message_here"

git push origin v1.5

git revert 129114c..84ecdec

git revert 129114c

git reset 129114c

git clean -df (delete untracked files and directories from the working tree)

git checkout -- . (discard unstaged changes to tracked files)

git status

git branch -m alias (change branch name)

gp upstream 2020Q4_rollout --force (assuming gp is a shell alias for git push)

git cherry-pick 1946587bacae8e9c980b4974f987f3f7410be786

git squash commit to keep commit order

git checkout -b fb --track upstream/master

https://wiki.vip.corp.ebay.com/display/UserRegistration/Steps+to+create+release+feature+branch+in+Git+commands

4. .m2 location

/Users/yli/.m2/

5. Command-Shift-G: open the Go to Folder window in Finder or a file-selection dialog

https://www.cnet.com/how-to/finder-shortcuts-every-mac-user-needs-to-know/

6. Check listen port

lsof -i -n -P | grep LISTEN

lsof -n -i TCP:8080

c45041

kill -9 $(lsof -ti:3000)

kill -9 $(lsof -ti:3000,3001)

netstat -vanp tcp | grep 8080

7. Tomcat Server Argument Config (Run Configuration, Arguments)

-Xms512m -Xmx1536m -XX:MaxPermSize=512m -Djava.util.Arrays.useLegacyMergeSort=true

8. Run single test with mocha/JS

https://stackoverflow.com/questions/10832031/how-to-run-a-single-test-with-mocha

node_modules/.bin/mocha src/test-file.spec.js

Jest test single file

node_modules/.bin/jest test/app/page-builders/create/create-user-builder-test.js

9. PR Template

Why
Enable user to cancel the processing request to prevent further operation.

What
It shows a cancel title, warning body text, and a cancel button when the request is in processing status (see the code changes in this PR).

How
Tested by passing all unit and integration tests, then verified end to end; screenshots are attached in Jira.

10. Local test

/etc/hosts

127.0.0.1       localhost.com

127.0.0.1       lm-123.dev.com

11. Visual Studio Code debug config

        {
            "type": "node",
            "request": "launch",
            "name": "Launch PROJECT_NAME",
            "preLaunchTask": "transpile",
            "program": "${workspaceRoot}/index.js",
            "cwd": "${workspaceRoot}",
            "skipFiles": ["<node_internals>/**/*.js"],
            "runtimeExecutable": "/Users/yli/.nvm/versions/node/v16.13.2/bin/node"
        },

12. NodeJS hot refresh (start after code change automatically)

yarn browser-refresh index.js

13. Print circular structure JSON

https://stackoverflow.com/a/18354289/1815612

import * as util from 'util' // has no default export

import { inspect } from 'util' // or directly

// or

var util = require('util')

console.log(util.inspect(myObject))
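A minimal sketch of why util.inspect helps here, assuming a hypothetical object myObject that references itself (the object and its fields are illustrative, not from the linked answer). JSON.stringify throws a TypeError on the cycle, while util.inspect marks the circular reference instead of throwing.

```javascript
const util = require('util');

// Hypothetical object with a self-reference (names are illustrative only).
const myObject = { name: 'root' };
myObject.self = myObject;

// JSON.stringify cannot serialize cycles and throws a TypeError.
let stringifyError = null;
try {
  JSON.stringify(myObject);
} catch (err) {
  stringifyError = err;
}

// util.inspect walks the same object and marks the cycle instead of throwing.
const inspected = util.inspect(myObject, { depth: 2 });
console.log(inspected);
```

The exact marker text for the cycle varies between Node versions, so it is safer to log the string than to parse it.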


Build React from Scratch

I read a blog post that walks through writing a React-like framework from scratch. Building it this way is a good way to learn the core concepts in a nutshell, so I am sharing my notes below.

1. React uses JSX, but JSX is transpiled into plain JavaScript calls to createElement(), which produce the same element tree.

// Using JSX

<div onClick={handleClick}>

  <h1 className="header">Hello</h1>

</div>

// Using createElement()

createElement('div', {onClick: handleClick},

  createElement('h1', {className: 'header'}, 'Hello'));
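To make the transpiled form concrete, here is a minimal sketch of what such a function could return. This createElement is a hypothetical stand-in written for these notes, not React's implementation; the object shape (type, props, children) is an assumption.

```javascript
// Hypothetical stand-in for React.createElement: it only builds a plain
// object describing the element; nothing is rendered here.
const createElement = (type, props, ...children) => ({
  type,
  props: props || {},
  children,
});

const handleClick = () => {};

// Same tree as the JSX example: a div wrapping an h1 with the text "Hello".
const tree = createElement('div', { onClick: handleClick },
  createElement('h1', { className: 'header' }, 'Hello'));

console.log(tree.type, tree.children[0].type, tree.children[0].children);
```

The nested calls mirror the nesting of the JSX tags, which is why a transpiler can translate one into the other mechanically.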

2. React diffs the virtual DOM against the current tree and updates only what changed. The toy render() below compares each virtual node directly with the real DOM node.

const h = (el, props, ...children) => ({el, props, children});

const render = (vnodes, dom) => {

  vnodes = [].concat(vnodes);

  const forceUpdate = () => render(vnodes, dom);

  vnodes.forEach((v, i) => {

    while (typeof v.el === 'function') {

      v = v.el(v.props, v.children, forceUpdate);

    }

    const newNode = () => v.el ? document.createElement(v.el) : document.createTextNode(v);

    let node = dom.childNodes[i];

    if (!node || (node.el !== v.el && node.data !== v)) {

       node = dom.insertBefore(newNode(), node);

    }

    if (v.el) {

      node.el = v.el;

      for (let propName in v.props) {

        if (node[propName] !== v.props[propName]) {

          node[propName] = v.props[propName];

        }

      }

      render(v.children, node);

    } else {

      node.data = v;

    }

  });

  for (let c; (c = dom.childNodes[vnodes.length]); ) {

    dom.removeChild(c);

  }

};

// Example

const Header = (props, children) => (

  h('h1', {style: "color: red"}, ...children)

);

render(h(Header, {}, 'Hello', 'World'), document.body);
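The "update only what changed" idea can be shown without a browser. The sketch below isolates the property comparison that render() performs in its inner loop; diffProps and the sample prop objects are hypothetical names introduced for illustration. Like the loop above, it ignores props that were removed.

```javascript
// Returns only the props whose values differ between two prop objects.
const diffProps = (oldProps, newProps) => {
  const patch = {};
  for (const name of Object.keys(newProps)) {
    if (oldProps[name] !== newProps[name]) {
      patch[name] = newProps[name];
    }
  }
  return patch;
};

// Hypothetical before/after props for one element.
const before = { className: 'header', title: 'Hi' };
const after = { className: 'header', title: 'Hello', id: 'main' };

// Only title and id need to be written; className is left alone.
const patch = diffProps(before, after);
console.log(patch);
```

Skipping unchanged properties matters because writes to real DOM nodes are far more expensive than comparing plain values.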

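3. In this toy framework, a stateful component keeps its state in a global variable and calls forceUpdate() to re-render.

let n = 0;

const Counter = (props, children, forceUpdate) => {
  // Increment the global counter, then ask the framework to re-render.
  const handleClick = () => {
    n++;
    forceUpdate();
  };
  // The return value is an assumed completion of the truncated snippet,
  // built with the h() helper defined above.
  return h('div', {},
    h('div', {}, `Count: ${n}`),
    h('button', {onclick: handleClick}, 'Add'));
};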

4. ES6 tagged template literals can stand in for JSX: the tag function receives the literal strings and the interpolated values as separate arguments.

const x = (strings, ...fields) => {...};

x`Hello, ${user}!`

// strings: ['Hello, ', '!'];

// fields: [user]

const Hello = ({onClick}, children) => x`
  <div className="foo" onclick=${onClick}>
    ${children}
  </div>
`;

render(h(Hello, {onClick: () => {}}, 'Hello world'), document.body);
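A small runnable sketch of how a tag function sees its arguments. The tag x here only records what it receives (the blog's real x parses the markup into virtual nodes, which is not reproduced), and the user value is hypothetical.

```javascript
// A tag function receives the literal string parts first,
// then each interpolated value as a separate argument.
const x = (strings, ...fields) => ({ strings: [...strings], fields });

const user = 'Alice'; // hypothetical value
const parts = x`Hello, ${user}!`;

// Interleaving the two arrays rebuilds the original text.
const joined = parts.strings.reduce(
  (out, s, i) => out + s + (i < parts.fields.length ? parts.fields[i] : ''),
  ''
);
console.log(parts, joined);
```

There is always one more string part than there are interpolated fields, which is what lets a parser treat the fields as placeholders between known pieces of markup.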

Reference

LET’S MAKE THE WORST REACT EVER!
